2025-07-17T08:41:12.2576853Z Current runner version: '2.326.0' 2025-07-17T08:41:12.2582327Z Runner name: 'pytorch-rocm-hw-43' 2025-07-17T08:41:12.2583006Z Runner group name: 'linux.rocm.gpu.group' 2025-07-17T08:41:12.2583739Z Machine name: 'pytorch-rocm-hw-43' 2025-07-17T08:41:12.2586162Z ##[group]GITHUB_TOKEN Permissions 2025-07-17T08:41:12.2587936Z Contents: read 2025-07-17T08:41:12.2588418Z Metadata: read 2025-07-17T08:41:12.2588874Z ##[endgroup] 2025-07-17T08:41:12.2590675Z Secret source: Actions 2025-07-17T08:41:12.2591366Z Prepare workflow directory 2025-07-17T08:41:12.3118429Z Prepare all required actions 2025-07-17T08:41:12.3168065Z Getting action download info 2025-07-17T08:41:12.7766579Z Download action repository 'pytorch/pytorch@main' (SHA:39ac189808c61588f3594dbc2fc1d69bb6194c47) 2025-07-17T08:41:17.8355313Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722) 2025-07-17T08:41:18.3854251Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-07-17T08:41:18.8776476Z Download action repository 'pytorch/test-infra@main' (SHA:a9ec424ad5e5851e47d68139cfd953b4031778d5) 2025-07-17T08:41:19.8575273Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-07-17T08:41:20.6249975Z Getting action download info 2025-07-17T08:41:20.7976821Z Download action repository 'actions/checkout@v4' (SHA:11bd71901bbe5b1630ceea73d27597364c9af683) 2025-07-17T08:41:21.4834328Z Getting action download info 2025-07-17T08:41:21.6721600Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e) 2025-07-17T08:41:22.1736070Z Getting action download info 2025-07-17T08:41:22.3862651Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (a38f433be2e94a64b095a44ba39879d02d0c2316) 2025-07-17T08:41:22.3870390Z ##[group] Inputs 2025-07-17T08:41:22.3871046Z build-environment: linux-jammy-rocm-py3.10 2025-07-17T08:41:22.3872407Z test-matrix: {"include": [{"config": "inductor", "shard": 1, "num_shards": 2, "runner": "linux.rocm.gpu.2"}, {"config": "inductor", "shard": 2, "num_shards": 2, "runner": "linux.rocm.gpu.2"}]} 2025-07-17T08:41:22.3874497Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:22.3875792Z sync-tag: 2025-07-17T08:41:22.3877248Z timeout-minutes: 300 2025-07-17T08:41:22.3877724Z tests-to-include: 2025-07-17T08:41:22.3878134Z dashboard-tag: 2025-07-17T08:41:22.3879157Z disable-monitor: true 2025-07-17T08:41:22.3879843Z monitor-log-interval: 5 2025-07-17T08:41:22.3880371Z monitor-data-collect-interval: 1 2025-07-17T08:41:22.3880908Z ##[endgroup] 2025-07-17T08:41:22.3881589Z Complete job name: rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:41:22.5023440Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2025-07-17T08:41:22.5024020Z with: 2025-07-17T08:41:22.5024186Z no-sudo: true 2025-07-17T08:41:22.5024367Z submodules: recursive 2025-07-17T08:41:22.5024556Z fetch-depth: 0 2025-07-17T08:41:22.5024865Z env: 2025-07-17T08:41:22.5025048Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:22.5025246Z ##[endgroup] 2025-07-17T08:41:22.5096881Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-07-17T08:41:22.5097612Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-07-17T08:41:22.5125339Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:22.5125637Z env: 2025-07-17T08:41:22.5125807Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:22.5125997Z ##[endgroup] 2025-07-17T08:41:22.5444005Z ##[group]Run # Use all available CPUs for fetching 2025-07-17T08:41:22.5445380Z # Use all available CPUs for fetching 2025-07-17T08:41:22.5446001Z cd "${GITHUB_WORKSPACE}" 2025-07-17T08:41:22.5446582Z git config --global fetch.parallel 0 2025-07-17T08:41:22.5447259Z git config --global submodule.fetchJobs 0 2025-07-17T08:41:22.5447852Z  2025-07-17T08:41:22.5448450Z # Clean workspace. The default checkout action should also do this, but 2025-07-17T08:41:22.5449262Z # do it here as well just in case 2025-07-17T08:41:22.5449821Z if [[ -d .git ]]; then 2025-07-17T08:41:22.5450377Z  if [ -z "${NO_SUDO}" ]; then 2025-07-17T08:41:22.5450939Z  sudo git clean -ffdx 2025-07-17T08:41:22.5451447Z  else 2025-07-17T08:41:22.5451868Z  git clean -ffdx 2025-07-17T08:41:22.5452302Z  fi 2025-07-17T08:41:22.5452692Z fi 2025-07-17T08:41:22.5506432Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:22.5507280Z env: 2025-07-17T08:41:22.5507725Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:22.5508259Z NO_SUDO: true 2025-07-17T08:41:22.5508897Z ##[endgroup] 2025-07-17T08:41:23.0698476Z Removing .additional_ci_files/ 2025-07-17T08:41:23.0699238Z Removing build/ 2025-07-17T08:41:23.0699657Z Removing dist/ 2025-07-17T08:41:23.0700113Z Removing test/test-reports/ 2025-07-17T08:41:23.0825348Z ##[group]Run actions/checkout@v4 2025-07-17T08:41:23.0825911Z with: 2025-07-17T08:41:23.0826344Z ref: a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:41:23.0826916Z fetch-depth: 0 2025-07-17T08:41:23.0827334Z submodules: recursive 2025-07-17T08:41:23.0827763Z show-progress: false 2025-07-17T08:41:23.0828206Z repository: pytorch/pytorch 2025-07-17T08:41:23.0828913Z token: *** 2025-07-17T08:41:23.0829381Z ssh-strict: true 2025-07-17T08:41:23.0829828Z ssh-user: git 2025-07-17T08:41:23.0830340Z persist-credentials: true 2025-07-17T08:41:23.0830930Z clean: true 2025-07-17T08:41:23.0831455Z sparse-checkout-cone-mode: true 2025-07-17T08:41:23.0832074Z fetch-tags: false 2025-07-17T08:41:23.0832449Z lfs: false 2025-07-17T08:41:23.0832837Z set-safe-directory: true 2025-07-17T08:41:23.0833256Z env: 2025-07-17T08:41:23.0833626Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:23.0834041Z ##[endgroup] 2025-07-17T08:41:23.1881234Z Syncing repository: pytorch/pytorch 2025-07-17T08:41:23.1883747Z ##[group]Getting Git version info 2025-07-17T08:41:23.1884680Z Working directory is '/home/pytorchci/actions-runner/_work/pytorch/pytorch' 2025-07-17T08:41:23.1886011Z [command]/usr/bin/git version 2025-07-17T08:41:23.1886494Z git version 2.34.1 2025-07-17T08:41:23.1888051Z ##[endgroup] 2025-07-17T08:41:23.1895189Z Copying '/home/pytorchci/.gitconfig' to '/home/pytorchci/actions-runner/_work/_temp/b1a22262-bd44-4816-ad2e-a951b9850dbb/.gitconfig' 2025-07-17T08:41:23.1897690Z Temporarily overriding HOME='/home/pytorchci/actions-runner/_work/_temp/b1a22262-bd44-4816-ad2e-a951b9850dbb' before making global git config changes 2025-07-17T08:41:23.1899356Z Adding repository directory to the temporary git global config as a safe directory 2025-07-17T08:41:23.1900653Z [command]/usr/bin/git config --global --add safe.directory /home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-07-17T08:41:23.1902462Z [command]/usr/bin/git config --local --get remote.origin.url 2025-07-17T08:41:23.1903265Z https://github.com/pytorch/pytorch 2025-07-17T08:41:23.1905862Z ##[group]Removing previously created refs, to avoid conflicts 2025-07-17T08:41:23.1907930Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-07-17T08:41:23.1935693Z HEAD 2025-07-17T08:41:23.1973052Z ##[endgroup] 2025-07-17T08:41:23.1975277Z [command]/usr/bin/git submodule status 2025-07-17T08:41:23.2354325Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-07-17T08:41:23.2455753Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081) 2025-07-17T08:41:23.2545108Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-07-17T08:41:23.2669589Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-07-17T08:41:23.2745569Z 2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07 third_party/NVTX (v3.1.0-263-g2942f16) 2025-07-17T08:41:23.2834086Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-07-17T08:41:23.3312828Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-07-17T08:41:23.3351164Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-07-17T08:41:23.3395555Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-07-17T08:41:23.3477521Z 434d19f696da62c12b5372b32cbc9ba968588d7e third_party/composable_kernel (rocm-6.4.0-172-g434d19f69) 2025-07-17T08:41:23.3626096Z 3af7f2c16147f3fbc6e4d717032daf505dc1652c third_party/cpp-httplib (v0.20.1) 2025-07-17T08:41:23.3791518Z 5e3d2445e6a84d9599bee2bf78edbb4d80865e1d third_party/cpuinfo (5e3d244) 2025-07-17T08:41:23.3839969Z f937055efc6d414d11f4c6577e3977fe74f35fb6 third_party/cudnn_frontend (v0.5-52-gf937055) 2025-07-17T08:41:23.3967805Z b995f933179c22d3fe0d871c3a53d11e4681950f third_party/cutlass (v4.0.0) 2025-07-17T08:41:23.4038625Z 157e88b750c452bef2ab4653fe9d1eeb151ce4c3 third_party/fbgemm (v1.2.0-186-g157e88b7) 2025-07-17T08:41:23.4152947Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-07-17T08:41:23.4204385Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-07-17T08:41:23.4588960Z 40626af88bd7df9a5fb80be7b25ac85b122d6c21 third_party/fmt (11.2.0) 2025-07-17T08:41:23.4725059Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-07-17T08:41:23.4824124Z c7b7b022c124d9643957d9bd55f57ac59fce8fa2 third_party/gloo (remotes/origin/HEAD) 2025-07-17T08:41:23.5034666Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-07-17T08:41:23.5119878Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-07-17T08:41:23.5210796Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-07-17T08:41:23.5426751Z 5e7501833f1021ce6f618572d3baf657b6319658 third_party/kineto (remotes/origin/HEAD) 2025-07-17T08:41:23.5466546Z cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7 third_party/kleidiai (v1.8.0) 2025-07-17T08:41:23.5499088Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-07-17T08:41:23.5528348Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-07-17T08:41:23.5863755Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-07-17T08:41:23.5904919Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-07-17T08:41:23.5949086Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-07-17T08:41:23.6311706Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-07-17T08:41:23.6430314Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-07-17T08:41:23.6502619Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-07-17T08:41:23.6559768Z a2e59f0e7065404b44dfe92a28aca47ba1378dc4 third_party/pybind11 (v2.11.0-182-ga2e59f0e) 2025-07-17T08:41:23.6665572Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-07-17T08:41:23.6754873Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-07-17T08:41:23.6865964Z 52791a2fd214b2a9dc5759d36725909c1daa7f2e third_party/tensorpipe (remotes/origin/master) 2025-07-17T08:41:23.6888256Z ##[group]Cleaning the repository 2025-07-17T08:41:23.6889648Z [command]/usr/bin/git clean -ffdx 2025-07-17T08:41:23.7200399Z [command]/usr/bin/git reset --hard HEAD 2025-07-17T08:41:23.8246548Z HEAD is now at f6d138807f1 Always disable ShardingPropagation cache if compiling (#156868) 2025-07-17T08:41:23.8301815Z ##[endgroup] 2025-07-17T08:41:23.8303986Z ##[group]Disabling automatic garbage collection 2025-07-17T08:41:23.8313875Z [command]/usr/bin/git config --local gc.auto 0 2025-07-17T08:41:23.8369697Z ##[endgroup] 2025-07-17T08:41:23.8370452Z ##[group]Setting up auth 2025-07-17T08:41:23.8382544Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-07-17T08:41:23.8438705Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-07-17T08:41:23.8739623Z Entering 'android/libs/fbjni' 2025-07-17T08:41:23.8784695Z Entering 'third_party/FP16' 2025-07-17T08:41:23.8823986Z Entering 'third_party/FXdiv' 2025-07-17T08:41:23.8865049Z Entering 'third_party/NNPACK' 2025-07-17T08:41:23.8908064Z Entering 'third_party/NVTX' 2025-07-17T08:41:23.8948901Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:23.8992580Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:23.9049882Z Entering 'third_party/aiter' 2025-07-17T08:41:23.9095940Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:23.9153606Z Entering 'third_party/benchmark' 2025-07-17T08:41:23.9201140Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:23.9250709Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:23.9296822Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:23.9345681Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:23.9397620Z Entering 'third_party/cutlass' 2025-07-17T08:41:23.9453195Z Entering 'third_party/fbgemm' 2025-07-17T08:41:23.9501993Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:23.9550907Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:23.9599294Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:23.9636539Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:23.9681209Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:23.9724233Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:23.9761676Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:23.9808460Z Entering 'third_party/flash-attention' 2025-07-17T08:41:23.9849074Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:23.9901080Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:23.9957562Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:24.0008280Z Entering 'third_party/fmt' 2025-07-17T08:41:24.0047848Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:24.0087454Z Entering 'third_party/gloo' 2025-07-17T08:41:24.0126706Z Entering 'third_party/googletest' 2025-07-17T08:41:24.0174472Z Entering 'third_party/ideep' 2025-07-17T08:41:24.0218911Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:24.0271566Z Entering 'third_party/ittapi' 2025-07-17T08:41:24.0316682Z Entering 'third_party/kineto' 2025-07-17T08:41:24.0375695Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:24.0420027Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:24.0463400Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:24.0503123Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:24.0543213Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:24.0580303Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:24.0622957Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:24.0671482Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:24.0716914Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:24.0763210Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:24.0810176Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:24.0853535Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:24.0901632Z Entering 'third_party/kleidiai' 2025-07-17T08:41:24.0941651Z Entering 'third_party/mimalloc' 2025-07-17T08:41:24.0987587Z Entering 'third_party/nlohmann' 2025-07-17T08:41:24.1039161Z Entering 'third_party/onnx' 2025-07-17T08:41:24.1097716Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:24.1145326Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:24.1195965Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:24.1239260Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:24.1276579Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:24.1314628Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:24.1352744Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:24.1388756Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:24.1431754Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:24.1474017Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:24.1520440Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:24.1565869Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:24.1628412Z Entering 'third_party/pocketfft' 2025-07-17T08:41:24.1677183Z Entering 'third_party/protobuf' 2025-07-17T08:41:24.1729118Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:24.1773594Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:24.1817574Z Entering 'third_party/psimd' 2025-07-17T08:41:24.1865280Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:24.1906864Z Entering 'third_party/pybind11' 2025-07-17T08:41:24.1957978Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:24.2000509Z Entering 'third_party/sleef' 2025-07-17T08:41:24.2047530Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:24.2087342Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:24.2125418Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:24.2167595Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:24.2205850Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:24.2246203Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:24.2311840Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-07-17T08:41:24.2339159Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-07-17T08:41:24.2639309Z Entering 'android/libs/fbjni' 2025-07-17T08:41:24.2692126Z Entering 'third_party/FP16' 2025-07-17T08:41:24.2739526Z Entering 'third_party/FXdiv' 2025-07-17T08:41:24.2780135Z Entering 'third_party/NNPACK' 2025-07-17T08:41:24.2824834Z Entering 'third_party/NVTX' 2025-07-17T08:41:24.2871470Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:24.2913494Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:24.2971751Z Entering 'third_party/aiter' 2025-07-17T08:41:24.3017917Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:24.3067826Z Entering 'third_party/benchmark' 2025-07-17T08:41:24.3109030Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:24.3158834Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:24.3206204Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:24.3247285Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:24.3287276Z Entering 'third_party/cutlass' 2025-07-17T08:41:24.3334033Z Entering 'third_party/fbgemm' 2025-07-17T08:41:24.3381346Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:24.3418909Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:24.3468998Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:24.3516530Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:24.3564005Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:24.3604478Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:24.3643375Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:24.3691064Z Entering 'third_party/flash-attention' 2025-07-17T08:41:24.3737982Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:24.3786836Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:24.3847348Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:24.3900272Z Entering 'third_party/fmt' 2025-07-17T08:41:24.3946109Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:24.3986595Z Entering 'third_party/gloo' 2025-07-17T08:41:24.4035690Z Entering 'third_party/googletest' 2025-07-17T08:41:24.4079829Z Entering 'third_party/ideep' 2025-07-17T08:41:24.4119127Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:24.4167304Z Entering 'third_party/ittapi' 2025-07-17T08:41:24.4213601Z Entering 'third_party/kineto' 2025-07-17T08:41:24.4259458Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:24.4300214Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:24.4339918Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:24.4377094Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:24.4417000Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:24.4460625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:24.4501611Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:24.4539411Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:24.4579350Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:24.4620958Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:24.4665210Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:24.4712718Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:24.4760929Z Entering 'third_party/kleidiai' 2025-07-17T08:41:24.4806289Z Entering 'third_party/mimalloc' 2025-07-17T08:41:24.4853137Z Entering 'third_party/nlohmann' 2025-07-17T08:41:24.4900826Z Entering 'third_party/onnx' 2025-07-17T08:41:24.4961554Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:24.5012013Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:24.5061844Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:24.5102747Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:24.5142562Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:24.5183791Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:24.5228506Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:24.5267197Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:24.5305170Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:24.5353318Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:24.5400740Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:24.5449240Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:24.5510203Z Entering 'third_party/pocketfft' 2025-07-17T08:41:24.5558505Z Entering 'third_party/protobuf' 2025-07-17T08:41:24.5610455Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:24.5656070Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:24.5702516Z Entering 'third_party/psimd' 2025-07-17T08:41:24.5749141Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:24.5793933Z Entering 'third_party/pybind11' 2025-07-17T08:41:24.5837050Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:24.5888599Z Entering 'third_party/sleef' 2025-07-17T08:41:24.5927885Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:24.5967020Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:24.6013907Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:24.6055684Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:24.6095750Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:24.6146454Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:24.6229882Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-07-17T08:41:24.6269433Z ##[endgroup] 2025-07-17T08:41:24.6270202Z ##[group]Fetching the repository 2025-07-17T08:41:24.6275659Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-07-17T08:41:25.2968003Z From https://github.com/pytorch/pytorch 2025-07-17T08:41:25.2968796Z - [deleted] (none) -> ciflow/inductor/158288 2025-07-17T08:41:25.3777886Z - [deleted] (none) -> ciflow/inductor/158291 2025-07-17T08:41:25.3778725Z - [deleted] (none) -> ciflow/inductor/158527 2025-07-17T08:41:25.3779830Z - [deleted] (none) -> ciflow/rocm/158037 2025-07-17T08:41:25.3781265Z - [deleted] (none) -> ciflow/trunk/158037 2025-07-17T08:41:25.3782820Z - [deleted] (none) -> ciflow/trunk/158290 2025-07-17T08:41:25.3784437Z - [deleted] (none) -> ciflow/trunk/158291 2025-07-17T08:41:25.3786038Z - [deleted] (none) -> ciflow/trunk/158407 2025-07-17T08:41:25.3787625Z - [deleted] (none) -> ciflow/trunk/158416 2025-07-17T08:41:25.3789190Z - [deleted] (none) -> ciflow/trunk/158453 2025-07-17T08:41:25.3790855Z - [deleted] (none) -> ciflow/trunk/158489 2025-07-17T08:41:25.3792348Z - [deleted] (none) -> ciflow/trunk/158527 2025-07-17T08:41:27.9369900Z 2050a831324..a9ab31ece73 gh/Sidharth123-cpu/43/base -> origin/gh/Sidharth123-cpu/43/base 2025-07-17T08:41:27.9378857Z ae36a219aec..bc020792a93 gh/Sidharth123-cpu/43/head -> origin/gh/Sidharth123-cpu/43/head 2025-07-17T08:41:27.9384114Z + ff80f13c50a...558aa0377ea gh/Sidharth123-cpu/43/orig -> origin/gh/Sidharth123-cpu/43/orig (forced update) 2025-07-17T08:41:27.9405324Z a0dbea65e6f..c95331cd836 gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-07-17T08:41:27.9415296Z 1efebd56387..a83ce9d5fe0 gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-07-17T08:41:27.9419322Z + 160d337cf57...aba7ddb1246 gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig (forced update) 2025-07-17T08:41:27.9426963Z f589cb4a722..8e2c6ff709f gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-07-17T08:41:27.9433710Z 3e66bd8fa8d..d05480c2360 gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-07-17T08:41:27.9440587Z + 2f4c9e4605a...3f981074efb gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig (forced update) 2025-07-17T08:41:27.9447371Z b7e61f8ef3e..018d000fae4 gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-07-17T08:41:27.9454743Z 72fa39125dc..80a7d88057d gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-07-17T08:41:27.9457696Z + b46f356160b...21a43a7bbae gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig (forced update) 2025-07-17T08:41:27.9463075Z e0941789541..7c0318f29ec gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-07-17T08:41:27.9469301Z 86295317601..870921f43d6 gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-07-17T08:41:27.9475944Z + 9a306d0fb65...68cc42ef0d8 gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig (forced update) 2025-07-17T08:41:27.9484992Z 7ca1c2847f5..dcc1adf677c gh/XuehaiPan/290/base -> origin/gh/XuehaiPan/290/base 2025-07-17T08:41:27.9491692Z 98d8e32c5ff..732eaba3a4c gh/XuehaiPan/290/head -> origin/gh/XuehaiPan/290/head 2025-07-17T08:41:27.9496059Z + 5c17d8b6786...6fb43653468 gh/XuehaiPan/290/orig -> origin/gh/XuehaiPan/290/orig (forced update) 2025-07-17T08:41:27.9507711Z e53d3c12403..1709d95d9b7 gh/XuehaiPan/328/base -> origin/gh/XuehaiPan/328/base 2025-07-17T08:41:27.9513615Z 778e4c0f6c0..5505f95fff8 gh/XuehaiPan/328/head -> origin/gh/XuehaiPan/328/head 2025-07-17T08:41:27.9519834Z + be51f9b499d...61f4c7a0b85 gh/XuehaiPan/328/orig -> origin/gh/XuehaiPan/328/orig (forced update) 2025-07-17T08:41:27.9531425Z 954df72965f..ff39e21651d gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-07-17T08:41:27.9535422Z 208110970fc..aaeb58b661f gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-07-17T08:41:27.9538542Z + 2b719d36b2b...b8b4d96046c gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig (forced update) 2025-07-17T08:41:27.9549483Z 300ccaee7d6..2dbed576805 gh/XuehaiPan/344/base -> origin/gh/XuehaiPan/344/base 2025-07-17T08:41:27.9555547Z e295d4dde43..6d2a855c305 gh/XuehaiPan/344/head -> origin/gh/XuehaiPan/344/head 2025-07-17T08:41:27.9562181Z + f935d46b495...4ab1f01cf77 gh/XuehaiPan/344/orig -> origin/gh/XuehaiPan/344/orig (forced update) 2025-07-17T08:41:27.9568247Z b7cadd195cd..c3ba852a5bb gh/XuehaiPan/345/base -> origin/gh/XuehaiPan/345/base 2025-07-17T08:41:27.9573833Z 0418cbcd8bf..73f3202632f gh/XuehaiPan/345/head -> origin/gh/XuehaiPan/345/head 2025-07-17T08:41:27.9576997Z + 8546dffb863...571fcf292cf gh/XuehaiPan/345/orig -> origin/gh/XuehaiPan/345/orig (forced update) 2025-07-17T08:41:27.9581550Z 9b227073729..bca6306c575 gh/XuehaiPan/346/base -> origin/gh/XuehaiPan/346/base 2025-07-17T08:41:27.9587338Z 643d6117256..4a16426ad35 gh/XuehaiPan/346/head -> origin/gh/XuehaiPan/346/head 2025-07-17T08:41:27.9593316Z + 89d9ad2db0b...add1f8e950a gh/XuehaiPan/346/orig -> origin/gh/XuehaiPan/346/orig (forced update) 2025-07-17T08:41:27.9599225Z 489125cb6cc..f3d99ebd3ff gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-07-17T08:41:27.9605400Z ebba3522a23..412b8264451 gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-07-17T08:41:27.9612455Z + 09401e0e782...363b8bf2b22 gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig (forced update) 2025-07-17T08:41:27.9615847Z 089ec8d7805..f6491b39f7f gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-07-17T08:41:27.9619016Z e5a3b6d6286..38a8ad83f44 gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-07-17T08:41:27.9625263Z + 2c3812408a2...8877b8ebfff gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig (forced update) 2025-07-17T08:41:27.9631820Z 2a15541b5dd..67f00597263 gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-07-17T08:41:27.9637102Z 16875b228be..55473096e2a gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-07-17T08:41:27.9642952Z + dc278149c83...21623fdef59 gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig (forced update) 2025-07-17T08:41:27.9649110Z ad1ac3ef58b..71b54e5404c gh/XuehaiPan/352/base -> origin/gh/XuehaiPan/352/base 2025-07-17T08:41:27.9654535Z 9e83151d858..1be02042962 gh/XuehaiPan/352/head -> origin/gh/XuehaiPan/352/head 2025-07-17T08:41:27.9657655Z + 82ca4b4d10b...6e103440bc9 gh/XuehaiPan/352/orig -> origin/gh/XuehaiPan/352/orig (forced update) 2025-07-17T08:41:27.9664985Z f6c8169abca..a8fb7839752 gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-07-17T08:41:27.9670714Z 845a8df6b05..0ae2a3337b7 gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-07-17T08:41:27.9676371Z + 6d02d712728...3fe78eb180a gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig (forced update) 2025-07-17T08:41:27.9682843Z * [new branch] gh/XuehaiPan/369/base -> origin/gh/XuehaiPan/369/base 2025-07-17T08:41:27.9686826Z * [new branch] gh/XuehaiPan/369/head -> origin/gh/XuehaiPan/369/head 2025-07-17T08:41:27.9691093Z * [new branch] gh/XuehaiPan/369/orig -> origin/gh/XuehaiPan/369/orig 2025-07-17T08:41:27.9779593Z * [new branch] gh/guangyey/172/base -> origin/gh/guangyey/172/base 2025-07-17T08:41:27.9782327Z * [new branch] gh/guangyey/172/head -> origin/gh/guangyey/172/head 2025-07-17T08:41:27.9785241Z * [new branch] gh/guangyey/172/orig -> origin/gh/guangyey/172/orig 2025-07-17T08:41:27.9840062Z 6f59bfa1427..17d1d9b3770 gh/lw/2/head -> origin/gh/lw/2/head 2025-07-17T08:41:27.9845441Z + e1c3489f56e...4ec6221db18 gh/lw/2/orig -> origin/gh/lw/2/orig (forced update) 2025-07-17T08:41:27.9869431Z ed051c30846..1cc0eff541c gh/oulgen/44/base -> origin/gh/oulgen/44/base 2025-07-17T08:41:27.9874127Z f5ffcbfbc36..6c2f5866ffa gh/oulgen/44/head -> origin/gh/oulgen/44/head 2025-07-17T08:41:27.9876629Z + e117f879a40...d27686412a1 gh/oulgen/44/orig -> origin/gh/oulgen/44/orig (forced update) 2025-07-17T08:41:27.9901614Z 03c36cd41b5..1e345d3441d gh/xmfan/269/head -> origin/gh/xmfan/269/head 2025-07-17T08:41:27.9903627Z + e916252ea67...e2d689d3db5 gh/xmfan/269/orig -> origin/gh/xmfan/269/orig (forced update) 2025-07-17T08:41:27.9910689Z 0d481c36592..7770f763058 gh/ydwu4/282/head -> origin/gh/ydwu4/282/head 2025-07-17T08:41:27.9913310Z + 2dbbbcba273...d82de37a343 gh/ydwu4/282/orig -> origin/gh/ydwu4/282/orig (forced update) 2025-07-17T08:41:27.9924657Z 79d7c754ab8..39ac189808c main -> origin/main 2025-07-17T08:41:27.9933337Z + cb80bda2db4...d0b8cc6b79c mlazos/hop-modes -> origin/mlazos/hop-modes (forced update) 2025-07-17T08:41:27.9937151Z + 3c611c95e3a...228824d2cbe mlazos/nested-dc -> origin/mlazos/nested-dc (forced update) 2025-07-17T08:41:27.9942042Z ab43fe4bdf5..9e5df57ebb4 nightly -> origin/nightly 2025-07-17T08:41:27.9955766Z + 3133820e94e...67b6d88e85c update_submodule_FBGEMM -> origin/update_submodule_FBGEMM (forced update) 2025-07-17T08:41:27.9957984Z + a8647ccd0db...aa6f3f622b4 update_submodule_kineto -> origin/update_submodule_kineto (forced update) 2025-07-17T08:41:27.9960423Z 82a1ee1135b..2ad5c25cfc6 viable/strict -> origin/viable/strict 2025-07-17T08:41:27.9964844Z t [tag update] ciflow/binaries/156049 -> ciflow/binaries/156049 2025-07-17T08:41:27.9965951Z t [tag update] ciflow/binaries/157432 -> ciflow/binaries/157432 2025-07-17T08:41:27.9967180Z t [tag update] ciflow/binaries/158104 -> ciflow/binaries/158104 2025-07-17T08:41:27.9968860Z t [tag update] ciflow/binaries_libtorch/156049 -> ciflow/binaries_libtorch/156049 2025-07-17T08:41:27.9970620Z t [tag update] ciflow/binaries_libtorch/157432 -> ciflow/binaries_libtorch/157432 2025-07-17T08:41:27.9971966Z t [tag update] ciflow/binaries_wheel/156049 -> ciflow/binaries_wheel/156049 2025-07-17T08:41:27.9973065Z t [tag update] ciflow/binaries_wheel/157432 -> ciflow/binaries_wheel/157432 2025-07-17T08:41:27.9974399Z t [tag update] ciflow/h100-distributed/156605 -> ciflow/h100-distributed/156605 2025-07-17T08:41:27.9975836Z t [tag update] ciflow/h100-distributed/156703 -> ciflow/h100-distributed/156703 2025-07-17T08:41:27.9978427Z t [tag update] ciflow/inductor/137400 -> ciflow/inductor/137400 2025-07-17T08:41:27.9979836Z t [tag update] ciflow/inductor/148180 -> ciflow/inductor/148180 2025-07-17T08:41:27.9981303Z t [tag update] ciflow/inductor/148328 -> ciflow/inductor/148328 2025-07-17T08:41:27.9983217Z t [tag update] ciflow/inductor/152624 -> ciflow/inductor/152624 2025-07-17T08:41:27.9986924Z t [tag update] ciflow/inductor/155452 -> ciflow/inductor/155452 2025-07-17T08:41:27.9988404Z t [tag update] ciflow/inductor/156049 -> ciflow/inductor/156049 2025-07-17T08:41:27.9989711Z t [tag update] ciflow/inductor/156605 -> ciflow/inductor/156605 2025-07-17T08:41:27.9991459Z t [tag update] ciflow/inductor/157635 -> ciflow/inductor/157635 2025-07-17T08:41:27.9993563Z t [tag update] ciflow/inductor/157993 -> ciflow/inductor/157993 2025-07-17T08:41:27.9995041Z t [tag update] ciflow/inductor/158072 -> ciflow/inductor/158072 2025-07-17T08:41:27.9996511Z t [tag update] ciflow/inductor/158104 -> ciflow/inductor/158104 2025-07-17T08:41:28.0044525Z t [tag update] ciflow/inductor/158430 -> ciflow/inductor/158430 2025-07-17T08:41:28.0045438Z t [tag update] ciflow/inductor/158460 -> ciflow/inductor/158460 2025-07-17T08:41:28.0046303Z t [tag update] ciflow/inductor/158480 -> ciflow/inductor/158480 2025-07-17T08:41:28.0047113Z t [tag update] ciflow/inductor/158520 -> ciflow/inductor/158520 2025-07-17T08:41:28.0047874Z t [tag update] ciflow/inductor/158535 -> ciflow/inductor/158535 2025-07-17T08:41:28.0048601Z * [new tag] ciflow/inductor/158543 -> ciflow/inductor/158543 2025-07-17T08:41:28.0049401Z t [tag update] ciflow/mps/157553 -> ciflow/mps/157553 2025-07-17T08:41:28.0050153Z t [tag update] ciflow/trunk/137400 -> ciflow/trunk/137400 2025-07-17T08:41:28.0050880Z t [tag update] ciflow/trunk/148180 -> ciflow/trunk/148180 2025-07-17T08:41:28.0051606Z t [tag update] ciflow/trunk/148328 -> ciflow/trunk/148328 2025-07-17T08:41:28.0052324Z t [tag update] ciflow/trunk/152624 -> ciflow/trunk/152624 2025-07-17T08:41:28.0053078Z t [tag update] ciflow/trunk/156049 -> ciflow/trunk/156049 2025-07-17T08:41:28.0053808Z t [tag update] ciflow/trunk/156605 -> ciflow/trunk/156605 2025-07-17T08:41:28.0054522Z t [tag update] ciflow/trunk/157432 -> ciflow/trunk/157432 2025-07-17T08:41:28.0055240Z t [tag update] ciflow/trunk/157550 -> ciflow/trunk/157550 2025-07-17T08:41:28.0055948Z t [tag update] ciflow/trunk/157552 -> ciflow/trunk/157552 2025-07-17T08:41:28.0056661Z t [tag update] ciflow/trunk/158072 -> ciflow/trunk/158072 2025-07-17T08:41:28.0057368Z t [tag update] ciflow/trunk/158104 -> ciflow/trunk/158104 2025-07-17T08:41:28.0058094Z t [tag update] ciflow/trunk/158430 -> ciflow/trunk/158430 2025-07-17T08:41:28.0058790Z * [new tag] ciflow/trunk/158541 -> ciflow/trunk/158541 2025-07-17T08:41:28.0059450Z * [new tag] ciflow/trunk/158543 -> ciflow/trunk/158543 2025-07-17T08:41:28.0060154Z t [tag update] ciflow/xpu/158533 -> ciflow/xpu/158533 2025-07-17T08:41:28.0061332Z * [new tag] ciflow/xpu/158542 -> ciflow/xpu/158542 2025-07-17T08:41:28.0062467Z * [new tag] trunk/04349f9ee541c7d07cc057bbe739f46bd4c30dcc -> trunk/04349f9ee541c7d07cc057bbe739f46bd4c30dcc 2025-07-17T08:41:28.0063906Z * [new tag] trunk/09db3a22e8783c4841697317688ba9467c7cc457 -> trunk/09db3a22e8783c4841697317688ba9467c7cc457 2025-07-17T08:41:28.0065696Z * [new tag] trunk/39ac189808c61588f3594dbc2fc1d69bb6194c47 -> trunk/39ac189808c61588f3594dbc2fc1d69bb6194c47 2025-07-17T08:41:28.0105189Z * [new tag] trunk/9636e2cfd3e995ef977f670ad47e8e895296d992 -> trunk/9636e2cfd3e995ef977f670ad47e8e895296d992 2025-07-17T08:41:28.0110051Z * [new tag] trunk/9f37cce69334bccebf4b21503f0047d0c0bb320c -> trunk/9f37cce69334bccebf4b21503f0047d0c0bb320c 2025-07-17T08:41:28.0112718Z * [new tag] trunk/a38f433be2e94a64b095a44ba39879d02d0c2316 -> trunk/a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:41:28.0134966Z * [new tag] trunk/d76323d41742cbc05ec6857319b267d2c7ea8fd9 -> trunk/d76323d41742cbc05ec6857319b267d2c7ea8fd9 2025-07-17T08:41:28.0136871Z * [new tag] trunk/d9426a81d2ab54f809a3b32a6ab2e606073fe66f -> trunk/d9426a81d2ab54f809a3b32a6ab2e606073fe66f 2025-07-17T08:41:28.1293443Z [command]/usr/bin/git rev-parse --verify --quiet a38f433be2e94a64b095a44ba39879d02d0c2316^{object} 2025-07-17T08:41:28.1382745Z a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:41:28.1389526Z ##[endgroup] 2025-07-17T08:41:28.1390232Z ##[group]Determining the checkout info 2025-07-17T08:41:28.1391482Z ##[endgroup] 2025-07-17T08:41:28.1400466Z [command]/usr/bin/git sparse-checkout disable 2025-07-17T08:41:28.1611087Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-07-17T08:41:28.1643054Z ##[group]Checking out the ref 2025-07-17T08:41:28.1650253Z [command]/usr/bin/git checkout --progress --force a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:41:28.4501187Z Previous HEAD position was f6d138807f1 Always disable ShardingPropagation cache if compiling (#156868) 2025-07-17T08:41:28.4512394Z HEAD is now at a38f433be2e [Docker builds] Move from Miniconda to Miniforge (#158370) 2025-07-17T08:41:28.4642097Z ##[endgroup] 2025-07-17T08:41:28.4642855Z ##[group]Setting up auth for fetching submodules 2025-07-17T08:41:28.4653166Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-07-17T08:41:28.4721080Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-07-17T08:41:28.4773422Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-07-17T08:41:28.4820421Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-07-17T08:41:28.4863876Z ##[endgroup] 2025-07-17T08:41:28.4864565Z ##[group]Fetching submodules 2025-07-17T08:41:28.4870881Z [command]/usr/bin/git submodule sync --recursive 2025-07-17T08:41:28.5168259Z Synchronizing submodule url for 'android/libs/fbjni' 2025-07-17T08:41:28.5212293Z Synchronizing submodule url for 'third_party/FP16' 2025-07-17T08:41:28.5263094Z Synchronizing submodule url for 'third_party/FXdiv' 2025-07-17T08:41:28.5311646Z Synchronizing submodule url for 'third_party/NNPACK' 2025-07-17T08:41:28.5349603Z Synchronizing submodule url for 'third_party/NVTX' 2025-07-17T08:41:28.5390493Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:28.5430513Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-07-17T08:41:28.5484104Z Synchronizing submodule url for 'third_party/aiter' 2025-07-17T08:41:28.5526170Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:28.5574678Z Synchronizing submodule url for 'third_party/benchmark' 2025-07-17T08:41:28.5615921Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-07-17T08:41:28.5667149Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-07-17T08:41:28.5709962Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-07-17T08:41:28.5749890Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-07-17T08:41:28.5793477Z Synchronizing submodule url for 'third_party/cutlass' 2025-07-17T08:41:28.5850521Z Synchronizing submodule url for 'third_party/fbgemm' 2025-07-17T08:41:28.5891772Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:28.5939170Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:28.5985464Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:28.6028717Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:28.6075555Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:28.6114230Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:28.6150255Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-07-17T08:41:28.6193816Z Synchronizing submodule url for 'third_party/flash-attention' 2025-07-17T08:41:28.6230072Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:28.6279795Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:28.6334701Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-07-17T08:41:28.6380416Z Synchronizing submodule url for 'third_party/fmt' 2025-07-17T08:41:28.6415243Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:28.6452094Z Synchronizing submodule url for 'third_party/gloo' 2025-07-17T08:41:28.6502109Z Synchronizing submodule url for 'third_party/googletest' 2025-07-17T08:41:28.6541409Z Synchronizing submodule url for 'third_party/ideep' 2025-07-17T08:41:28.6573065Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:28.6616607Z Synchronizing submodule url for 'third_party/ittapi' 2025-07-17T08:41:28.6668079Z Synchronizing submodule url for 'third_party/kineto' 2025-07-17T08:41:28.6706907Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:28.6741872Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:28.6784184Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:28.6824929Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:28.6866658Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:28.6897593Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:28.6938929Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:28.6977698Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:28.7011400Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:28.7052265Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:28.7094814Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:28.7132928Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:28.7173973Z Synchronizing submodule url for 'third_party/kleidiai' 2025-07-17T08:41:28.7218801Z Synchronizing submodule url for 'third_party/mimalloc' 2025-07-17T08:41:28.7256844Z Synchronizing submodule url for 'third_party/nlohmann' 2025-07-17T08:41:28.7294824Z Synchronizing submodule url for 'third_party/onnx' 2025-07-17T08:41:28.7354880Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:28.7411025Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-07-17T08:41:28.7459256Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:28.7500298Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:28.7536985Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:28.7570704Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:28.7606334Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:28.7648131Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:28.7689805Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:28.7730293Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:28.7773958Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:28.7821053Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:28.7898957Z Synchronizing submodule url for 'third_party/pocketfft' 2025-07-17T08:41:28.7938892Z Synchronizing submodule url for 'third_party/protobuf' 2025-07-17T08:41:28.7989627Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:28.8039243Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:28.8081155Z Synchronizing submodule url for 'third_party/psimd' 2025-07-17T08:41:28.8128835Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-07-17T08:41:28.8167700Z Synchronizing submodule url for 'third_party/pybind11' 2025-07-17T08:41:28.8205249Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-07-17T08:41:28.8245162Z Synchronizing submodule url for 'third_party/sleef' 2025-07-17T08:41:28.8293814Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-07-17T08:41:28.8329598Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:28.8376243Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:28.8419682Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:28.8460450Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:28.8499428Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:28.8566247Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-07-17T08:41:28.9118487Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-07-17T08:41:28.9390134Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-07-17T08:41:28.9659722Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-07-17T08:41:28.9925040Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-07-17T08:41:29.0211825Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-07-17T08:41:29.0486550Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-07-17T08:41:29.0919879Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-07-17T08:41:29.1280424Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-07-17T08:41:29.1735898Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-07-17T08:41:29.2059434Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-07-17T08:41:29.2527006Z Submodule path 'third_party/composable_kernel': checked out '434d19f696da62c12b5372b32cbc9ba968588d7e' 2025-07-17T08:41:29.2875111Z Submodule path 'third_party/cpp-httplib': checked out '3af7f2c16147f3fbc6e4d717032daf505dc1652c' 2025-07-17T08:41:29.3166537Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-07-17T08:41:29.3472175Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-07-17T08:41:29.3828848Z Submodule path 'third_party/cutlass': checked out 'b995f933179c22d3fe0d871c3a53d11e4681950f' 2025-07-17T08:41:29.4217798Z Submodule path 'third_party/fbgemm': checked out '157e88b750c452bef2ab4653fe9d1eeb151ce4c3' 2025-07-17T08:41:29.4480466Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'e5d7c0bd5d9aec44d68830187138149e6a8c4e32' 2025-07-17T08:41:29.4818376Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '4a61bdd4bd4ed730e078aebc7c0fcf046ff29406' 2025-07-17T08:41:29.5112147Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-07-17T08:41:29.5426521Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '3ed8d2ec4ba35ef5d9d8353826209b6f868f63d3' 2025-07-17T08:41:29.5707320Z Submodule path 'third_party/fbgemm/external/googletest': checked out 'f8d7d77c06936315286eb55f8de22cd23c188571' 2025-07-17T08:41:29.5927707Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out 'a4337c69fe0e2552a7b7b0669178926beeed828c' 2025-07-17T08:41:29.6221301Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-07-17T08:41:29.6539184Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-07-17T08:41:29.6928332Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-07-17T08:41:29.7280237Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-07-17T08:41:29.7606369Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-07-17T08:41:29.7869723Z Submodule path 'third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-07-17T08:41:29.8132862Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-07-17T08:41:29.8413234Z Submodule path 'third_party/gloo': checked out 'c7b7b022c124d9643957d9bd55f57ac59fce8fa2' 2025-07-17T08:41:29.8678567Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-07-17T08:41:29.8947179Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-07-17T08:41:29.9378135Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-07-17T08:41:29.9686223Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-07-17T08:41:29.9997255Z Submodule path 'third_party/kineto': checked out '5e7501833f1021ce6f618572d3baf657b6319658' 2025-07-17T08:41:30.0238928Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2025-07-17T08:41:30.0499622Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-07-17T08:41:30.0749791Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-07-17T08:41:30.0978831Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-07-17T08:41:30.1220862Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-07-17T08:41:30.1461145Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-07-17T08:41:30.1708331Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-07-17T08:41:30.1942447Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2025-07-17T08:41:30.2241774Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-07-17T08:41:30.2495565Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-07-17T08:41:30.2774553Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2025-07-17T08:41:30.3030264Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2025-07-17T08:41:30.3336270Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-07-17T08:41:30.3676927Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-07-17T08:41:30.4062167Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-07-17T08:41:30.4499613Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-07-17T08:41:30.4797127Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-07-17T08:41:30.5173037Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-07-17T08:41:30.5429194Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-07-17T08:41:30.5687730Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-07-17T08:41:30.5948652Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-07-17T08:41:30.6255818Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-07-17T08:41:30.6529563Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-07-17T08:41:30.6771064Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-07-17T08:41:30.7016798Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-07-17T08:41:30.7300267Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-07-17T08:41:30.7578864Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-07-17T08:41:30.7992253Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-07-17T08:41:30.8343729Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-07-17T08:41:30.8819580Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-07-17T08:41:30.9053396Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-07-17T08:41:30.9314882Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-07-17T08:41:30.9614196Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-07-17T08:41:30.9869597Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-07-17T08:41:31.0175734Z Submodule path 'third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-07-17T08:41:31.0429124Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-07-17T08:41:31.0742524Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-07-17T08:41:31.1050298Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2025-07-17T08:41:31.1288651Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-07-17T08:41:31.1555700Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-07-17T08:41:31.1957611Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2025-07-17T08:41:31.2240094Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-07-17T08:41:31.2459212Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-07-17T08:41:31.2577564Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-07-17T08:41:31.2844763Z Entering 'android/libs/fbjni' 2025-07-17T08:41:31.2887998Z Entering 'third_party/FP16' 2025-07-17T08:41:31.2924491Z Entering 'third_party/FXdiv' 2025-07-17T08:41:31.2960235Z Entering 'third_party/NNPACK' 2025-07-17T08:41:31.2996268Z Entering 'third_party/NVTX' 2025-07-17T08:41:31.3034298Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:31.3070160Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:31.3119223Z Entering 'third_party/aiter' 2025-07-17T08:41:31.3158077Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:31.3207387Z Entering 'third_party/benchmark' 2025-07-17T08:41:31.3244744Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:31.3287074Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:31.3323662Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:31.3359894Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:31.3397511Z Entering 'third_party/cutlass' 2025-07-17T08:41:31.3444386Z Entering 'third_party/fbgemm' 2025-07-17T08:41:31.3483539Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:31.3526976Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:31.3567323Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:31.3604215Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:31.3647025Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:31.3685252Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:31.3725281Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:31.3769843Z Entering 'third_party/flash-attention' 2025-07-17T08:41:31.3807243Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:31.3852659Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:31.3896543Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:31.3939168Z Entering 'third_party/fmt' 2025-07-17T08:41:31.3981253Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:31.4019464Z Entering 'third_party/gloo' 2025-07-17T08:41:31.4058180Z Entering 'third_party/googletest' 2025-07-17T08:41:31.4100847Z Entering 'third_party/ideep' 2025-07-17T08:41:31.4139266Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:31.4187071Z Entering 'third_party/ittapi' 2025-07-17T08:41:31.4224945Z Entering 'third_party/kineto' 2025-07-17T08:41:31.4263363Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:31.4304719Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:31.4346336Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:31.4387798Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:31.4424552Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:31.4461922Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:31.4503994Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:31.4543250Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:31.4576811Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:31.4619178Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:31.4660147Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:31.4698788Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:31.4740582Z Entering 'third_party/kleidiai' 2025-07-17T08:41:31.4781098Z Entering 'third_party/mimalloc' 2025-07-17T08:41:31.4820325Z Entering 'third_party/nlohmann' 2025-07-17T08:41:31.4861122Z Entering 'third_party/onnx' 2025-07-17T08:41:31.4914588Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:31.4961420Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:31.5015207Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:31.5059387Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:31.5102107Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:31.5137785Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:31.5172207Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:31.5219278Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:31.5257664Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:31.5291112Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:31.5345235Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:31.5387793Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:31.5445160Z Entering 'third_party/pocketfft' 2025-07-17T08:41:31.5482936Z Entering 'third_party/protobuf' 2025-07-17T08:41:31.5523280Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:31.5566664Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:31.5609079Z Entering 'third_party/psimd' 2025-07-17T08:41:31.5648343Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:31.5688626Z Entering 'third_party/pybind11' 2025-07-17T08:41:31.5728860Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:31.5770792Z Entering 'third_party/sleef' 2025-07-17T08:41:31.5806715Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:31.5843118Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:31.5886418Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:31.5920396Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:31.5958085Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:31.5993919Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:31.6069971Z ##[endgroup] 2025-07-17T08:41:31.6070725Z ##[group]Persisting credentials for submodules 2025-07-17T08:41:31.6075682Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-07-17T08:41:31.6342547Z Entering 'android/libs/fbjni' 2025-07-17T08:41:31.6368848Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6369509Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6408397Z Entering 'third_party/FP16' 2025-07-17T08:41:31.6431012Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6431719Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6466276Z Entering 'third_party/FXdiv' 2025-07-17T08:41:31.6493661Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6494314Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6533511Z Entering 'third_party/NNPACK' 2025-07-17T08:41:31.6566408Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6567066Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6593333Z Entering 'third_party/NVTX' 2025-07-17T08:41:31.6618660Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6619324Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6650818Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:31.6672338Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6672986Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6706621Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:31.6731528Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6732219Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6782100Z Entering 'third_party/aiter' 2025-07-17T08:41:31.6807043Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6807718Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6846997Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:31.6867736Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6868397Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6915052Z Entering 'third_party/benchmark' 2025-07-17T08:41:31.6939354Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6939994Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6970112Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:31.6991559Z url.https://github.com/.insteadof 2025-07-17T08:41:31.6992210Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7028774Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:31.7056188Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7056837Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7088215Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:31.7114987Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7115642Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7149901Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:31.7177596Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7178250Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7212750Z Entering 'third_party/cutlass' 2025-07-17T08:41:31.7234487Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7235141Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7278299Z Entering 'third_party/fbgemm' 2025-07-17T08:41:31.7300404Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7301083Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7332447Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:31.7360726Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7361375Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7398336Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:31.7421271Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7421951Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7465845Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:31.7491402Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7492066Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7523821Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:31.7550060Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7550718Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7587484Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:31.7609265Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7609898Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7640302Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:31.7662767Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7663415Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7699561Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:31.7718547Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7719197Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7759238Z Entering 'third_party/flash-attention' 2025-07-17T08:41:31.7784329Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7784861Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7814803Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:31.7841754Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7842398Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7875826Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:31.7897282Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7897859Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7948352Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:31.7971234Z url.https://github.com/.insteadof 2025-07-17T08:41:31.7971898Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8013851Z Entering 'third_party/fmt' 2025-07-17T08:41:31.8050176Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8051322Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8086235Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:31.8108044Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8108691Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8138200Z Entering 'third_party/gloo' 2025-07-17T08:41:31.8162534Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8163118Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8204864Z Entering 'third_party/googletest' 2025-07-17T08:41:31.8227888Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8228544Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8264845Z Entering 'third_party/ideep' 2025-07-17T08:41:31.8293947Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8294609Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8324723Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:31.8345454Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8346015Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8387696Z Entering 'third_party/ittapi' 2025-07-17T08:41:31.8412323Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8412850Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8463626Z Entering 'third_party/kineto' 2025-07-17T08:41:31.8495079Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8495593Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8530680Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:31.8552279Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8553201Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8586852Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:31.8611821Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8612488Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8648442Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:31.8668171Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8668854Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8702442Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:31.8721976Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8722650Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8761031Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:31.8816710Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8817379Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8818224Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:31.8839350Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8840281Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8890826Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:31.8912488Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8913144Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8948331Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:31.8968195Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8968881Z url.https://github.com/.insteadof 2025-07-17T08:41:31.8998804Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:31.9019006Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9019653Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9049586Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:31.9069849Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9070509Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9106596Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:31.9133379Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9134052Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9169236Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:31.9189479Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9190166Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9222190Z Entering 'third_party/kleidiai' 2025-07-17T08:41:31.9253530Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9254219Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9286377Z Entering 'third_party/mimalloc' 2025-07-17T08:41:31.9311445Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9312091Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9345673Z Entering 'third_party/nlohmann' 2025-07-17T08:41:31.9375122Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9375779Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9408234Z Entering 'third_party/onnx' 2025-07-17T08:41:31.9430486Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9431140Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9476423Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:31.9503485Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9504149Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9543009Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:31.9566073Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9566721Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9597664Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:31.9620039Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9620698Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9655891Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:31.9676785Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9677456Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9712107Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:31.9732320Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9732974Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9762866Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:31.9782902Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9783565Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9817931Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:31.9842923Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9843573Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9880878Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:31.9901777Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9902367Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9931312Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:31.9950913Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9951463Z url.https://github.com/.insteadof 2025-07-17T08:41:31.9985811Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:32.0012105Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0012656Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0046252Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:32.0066595Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0067137Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0104965Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:32.0129425Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0129940Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0180943Z Entering 'third_party/pocketfft' 2025-07-17T08:41:32.0211436Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0211962Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0257303Z Entering 'third_party/protobuf' 2025-07-17T08:41:32.0284451Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0285133Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0328430Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:32.0357892Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0358548Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0396266Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:32.0419939Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0421066Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0460368Z Entering 'third_party/psimd' 2025-07-17T08:41:32.0484794Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0485465Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0517733Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:32.0544502Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0545156Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0574863Z Entering 'third_party/pybind11' 2025-07-17T08:41:32.0599046Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0599836Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0635206Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:32.0659648Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0660296Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0689669Z Entering 'third_party/sleef' 2025-07-17T08:41:32.0711870Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0712458Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0744104Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:32.0767516Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0768027Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0800507Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:32.0825574Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0826219Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0859268Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:32.0883458Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0884029Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0915324Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:32.0942673Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0943202Z url.https://github.com/.insteadof 2025-07-17T08:41:32.0979455Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:32.1013774Z url.https://github.com/.insteadof 2025-07-17T08:41:32.1014392Z url.https://github.com/.insteadof 2025-07-17T08:41:32.1044075Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:32.1066274Z url.https://github.com/.insteadof 2025-07-17T08:41:32.1066827Z url.https://github.com/.insteadof 2025-07-17T08:41:32.1128322Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-07-17T08:41:32.1427809Z Entering 'android/libs/fbjni' 2025-07-17T08:41:32.1489828Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-07-17T08:41:32.1510116Z Entering 'third_party/FP16' 2025-07-17T08:41:32.1545656Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-07-17T08:41:32.1568840Z Entering 'third_party/FXdiv' 2025-07-17T08:41:32.1605706Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-07-17T08:41:32.1623481Z Entering 'third_party/NNPACK' 2025-07-17T08:41:32.1659532Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-07-17T08:41:32.1678045Z Entering 'third_party/NVTX' 2025-07-17T08:41:32.1717442Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-07-17T08:41:32.1738671Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:32.1777212Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-07-17T08:41:32.1808290Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:32.1845150Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-07-17T08:41:32.1877648Z Entering 'third_party/aiter' 2025-07-17T08:41:32.1913666Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-07-17T08:41:32.1933037Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:32.1976688Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-07-17T08:41:32.2006856Z Entering 'third_party/benchmark' 2025-07-17T08:41:32.2047129Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-07-17T08:41:32.2066496Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:32.2104484Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-07-17T08:41:32.2134657Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:32.2168842Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-07-17T08:41:32.2187175Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:32.2221617Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-07-17T08:41:32.2240922Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:32.2280623Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-07-17T08:41:32.2300718Z Entering 'third_party/cutlass' 2025-07-17T08:41:32.2336783Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-07-17T08:41:32.2363644Z Entering 'third_party/fbgemm' 2025-07-17T08:41:32.2422595Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-07-17T08:41:32.2448084Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:32.2488009Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-07-17T08:41:32.2505962Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:32.2544471Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-07-17T08:41:32.2572524Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:32.2613986Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-07-17T08:41:32.2631540Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:32.2665430Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-07-17T08:41:32.2695179Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:32.2735902Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-07-17T08:41:32.2757218Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:32.2796781Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-07-17T08:41:32.2817762Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:32.2858071Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-07-17T08:41:32.2881184Z Entering 'third_party/flash-attention' 2025-07-17T08:41:32.2918485Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-07-17T08:41:32.2938797Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:32.2976739Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-07-17T08:41:32.2999642Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:32.3044059Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-07-17T08:41:32.3072017Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:32.3122425Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-07-17T08:41:32.3144874Z Entering 'third_party/fmt' 2025-07-17T08:41:32.3192212Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-07-17T08:41:32.3216373Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:32.3253590Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-07-17T08:41:32.3271012Z Entering 'third_party/gloo' 2025-07-17T08:41:32.3305020Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-07-17T08:41:32.3324777Z Entering 'third_party/googletest' 2025-07-17T08:41:32.3359547Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-07-17T08:41:32.3378959Z Entering 'third_party/ideep' 2025-07-17T08:41:32.3417221Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-07-17T08:41:32.3435618Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:32.3482324Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-07-17T08:41:32.3511634Z Entering 'third_party/ittapi' 2025-07-17T08:41:32.3558645Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-07-17T08:41:32.3582717Z Entering 'third_party/kineto' 2025-07-17T08:41:32.3624872Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-07-17T08:41:32.3647888Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:32.3686097Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-07-17T08:41:32.3703048Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:32.3745098Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-07-17T08:41:32.3765130Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:32.3802296Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-07-17T08:41:32.3820461Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:32.3856699Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-07-17T08:41:32.3879065Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:32.3920799Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-07-17T08:41:32.3936934Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:32.3977430Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-07-17T08:41:32.4001665Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:32.4051151Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-07-17T08:41:32.4069148Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:32.4103052Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-07-17T08:41:32.4120781Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:32.4162257Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-07-17T08:41:32.4183096Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:32.4215560Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-07-17T08:41:32.4235594Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:32.4285121Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-07-17T08:41:32.4304938Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:32.4341413Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-07-17T08:41:32.4361598Z Entering 'third_party/kleidiai' 2025-07-17T08:41:32.4399229Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-07-17T08:41:32.4419670Z Entering 'third_party/mimalloc' 2025-07-17T08:41:32.4459463Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-07-17T08:41:32.4484132Z Entering 'third_party/nlohmann' 2025-07-17T08:41:32.4523414Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-07-17T08:41:32.4543328Z Entering 'third_party/onnx' 2025-07-17T08:41:32.4582771Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-07-17T08:41:32.4620706Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:32.4660525Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-07-17T08:41:32.4683144Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:32.4728336Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-07-17T08:41:32.4748654Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:32.4793882Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-07-17T08:41:32.4812278Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:32.4861037Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-07-17T08:41:32.4877900Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:32.4918484Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-07-17T08:41:32.4940846Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:32.4977925Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-07-17T08:41:32.5000690Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:32.5037756Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-07-17T08:41:32.5060321Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:32.5100469Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-07-17T08:41:32.5129081Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:32.5169139Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-07-17T08:41:32.5186213Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:32.5222094Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-07-17T08:41:32.5241286Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:32.5275472Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-07-17T08:41:32.5295238Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:32.5336409Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-07-17T08:41:32.5380515Z Entering 'third_party/pocketfft' 2025-07-17T08:41:32.5423184Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-07-17T08:41:32.5446369Z Entering 'third_party/protobuf' 2025-07-17T08:41:32.5482395Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-07-17T08:41:32.5505678Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:32.5543472Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-07-17T08:41:32.5562489Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:32.5599288Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-07-17T08:41:32.5624965Z Entering 'third_party/psimd' 2025-07-17T08:41:32.5669880Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-07-17T08:41:32.5692677Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:32.5734475Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-07-17T08:41:32.5756224Z Entering 'third_party/pybind11' 2025-07-17T08:41:32.5804802Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-07-17T08:41:32.5824862Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:32.5862962Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-07-17T08:41:32.5882467Z Entering 'third_party/sleef' 2025-07-17T08:41:32.5932950Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-07-17T08:41:32.5951865Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:32.5985007Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-07-17T08:41:32.6004778Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:32.6043988Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-07-17T08:41:32.6062567Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:32.6097990Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-07-17T08:41:32.6116101Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:32.6152407Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-07-17T08:41:32.6169915Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:32.6202134Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-07-17T08:41:32.6218200Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:32.6254517Z file:/home/pytorchci/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-07-17T08:41:32.6516519Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-07-17T08:41:32.6794858Z Entering 'android/libs/fbjni' 2025-07-17T08:41:32.6840221Z Entering 'third_party/FP16' 2025-07-17T08:41:32.6876690Z Entering 'third_party/FXdiv' 2025-07-17T08:41:32.6914383Z Entering 'third_party/NNPACK' 2025-07-17T08:41:32.6949922Z Entering 'third_party/NVTX' 2025-07-17T08:41:32.6986236Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:32.7023546Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:32.7085013Z Entering 'third_party/aiter' 2025-07-17T08:41:32.7142005Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:32.7185634Z Entering 'third_party/benchmark' 2025-07-17T08:41:32.7226006Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:32.7272191Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:32.7311698Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:32.7350377Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:32.7387515Z Entering 'third_party/cutlass' 2025-07-17T08:41:32.7438826Z Entering 'third_party/fbgemm' 2025-07-17T08:41:32.7485690Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:32.7519733Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:32.7558908Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:32.7595726Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:32.7637650Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:32.7674057Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:32.7714352Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:32.7764239Z Entering 'third_party/flash-attention' 2025-07-17T08:41:32.7805525Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:32.7844272Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:32.7888375Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:32.7929677Z Entering 'third_party/fmt' 2025-07-17T08:41:32.7968001Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:32.8008585Z Entering 'third_party/gloo' 2025-07-17T08:41:32.8048087Z Entering 'third_party/googletest' 2025-07-17T08:41:32.8089106Z Entering 'third_party/ideep' 2025-07-17T08:41:32.8129957Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:32.8178880Z Entering 'third_party/ittapi' 2025-07-17T08:41:32.8223662Z Entering 'third_party/kineto' 2025-07-17T08:41:32.8265949Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:32.8307299Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:32.8347266Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:32.8382312Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:32.8416410Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:32.8452350Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:32.8492435Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:32.8532220Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:32.8566699Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:32.8603940Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:32.8641649Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:32.8679220Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:32.8727459Z Entering 'third_party/kleidiai' 2025-07-17T08:41:32.8774228Z Entering 'third_party/mimalloc' 2025-07-17T08:41:32.8811928Z Entering 'third_party/nlohmann' 2025-07-17T08:41:32.8852241Z Entering 'third_party/onnx' 2025-07-17T08:41:32.8902569Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:32.8948779Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:32.8996323Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:32.9037127Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:32.9072781Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:32.9114502Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:32.9168309Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:32.9202863Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:32.9240282Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:32.9278719Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:32.9318122Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:32.9359306Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:32.9420213Z Entering 'third_party/pocketfft' 2025-07-17T08:41:32.9458371Z Entering 'third_party/protobuf' 2025-07-17T08:41:32.9504348Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:32.9544285Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:32.9585868Z Entering 'third_party/psimd' 2025-07-17T08:41:32.9624911Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:32.9665191Z Entering 'third_party/pybind11' 2025-07-17T08:41:32.9711073Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:32.9756775Z Entering 'third_party/sleef' 2025-07-17T08:41:32.9800000Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:32.9838944Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:32.9881389Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:32.9915241Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:32.9958812Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:33.0001031Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:33.0066861Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-07-17T08:41:33.0318092Z Entering 'android/libs/fbjni' 2025-07-17T08:41:33.0358515Z Entering 'third_party/FP16' 2025-07-17T08:41:33.0394379Z Entering 'third_party/FXdiv' 2025-07-17T08:41:33.0430620Z Entering 'third_party/NNPACK' 2025-07-17T08:41:33.0468885Z Entering 'third_party/NVTX' 2025-07-17T08:41:33.0507629Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T08:41:33.0544034Z Entering 'third_party/XNNPACK' 2025-07-17T08:41:33.0598024Z Entering 'third_party/aiter' 2025-07-17T08:41:33.0648060Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T08:41:33.0698035Z Entering 'third_party/benchmark' 2025-07-17T08:41:33.0739969Z Entering 'third_party/composable_kernel' 2025-07-17T08:41:33.0783884Z Entering 'third_party/cpp-httplib' 2025-07-17T08:41:33.0824993Z Entering 'third_party/cpuinfo' 2025-07-17T08:41:33.0867398Z Entering 'third_party/cudnn_frontend' 2025-07-17T08:41:33.0904246Z Entering 'third_party/cutlass' 2025-07-17T08:41:33.0951996Z Entering 'third_party/fbgemm' 2025-07-17T08:41:33.0992223Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T08:41:33.1031182Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T08:41:33.1072081Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T08:41:33.1113420Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T08:41:33.1156172Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T08:41:33.1197950Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T08:41:33.1238068Z Entering 'third_party/fbgemm/external/json' 2025-07-17T08:41:33.1282524Z Entering 'third_party/flash-attention' 2025-07-17T08:41:33.1332192Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T08:41:33.1375162Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T08:41:33.1429997Z Entering 'third_party/flatbuffers' 2025-07-17T08:41:33.1476685Z Entering 'third_party/fmt' 2025-07-17T08:41:33.1516662Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T08:41:33.1560966Z Entering 'third_party/gloo' 2025-07-17T08:41:33.1605050Z Entering 'third_party/googletest' 2025-07-17T08:41:33.1648306Z Entering 'third_party/ideep' 2025-07-17T08:41:33.1688268Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T08:41:33.1741197Z Entering 'third_party/ittapi' 2025-07-17T08:41:33.1782800Z Entering 'third_party/kineto' 2025-07-17T08:41:33.1821816Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T08:41:33.1859338Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T08:41:33.1896615Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T08:41:33.1943944Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T08:41:33.1982789Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T08:41:33.2020051Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T08:41:33.2065616Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T08:41:33.2102320Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T08:41:33.2136661Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T08:41:33.2172994Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T08:41:33.2212515Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T08:41:33.2255948Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T08:41:33.2296355Z Entering 'third_party/kleidiai' 2025-07-17T08:41:33.2342376Z Entering 'third_party/mimalloc' 2025-07-17T08:41:33.2382897Z Entering 'third_party/nlohmann' 2025-07-17T08:41:33.2423281Z Entering 'third_party/onnx' 2025-07-17T08:41:33.2477129Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T08:41:33.2525865Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T08:41:33.2566146Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T08:41:33.2607549Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T08:41:33.2641299Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T08:41:33.2680661Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T08:41:33.2721252Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T08:41:33.2758512Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T08:41:33.2797862Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T08:41:33.2834143Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T08:41:33.2870937Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T08:41:33.2908312Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T08:41:33.2965344Z Entering 'third_party/pocketfft' 2025-07-17T08:41:33.3007964Z Entering 'third_party/protobuf' 2025-07-17T08:41:33.3050281Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T08:41:33.3088276Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T08:41:33.3130694Z Entering 'third_party/psimd' 2025-07-17T08:41:33.3167862Z Entering 'third_party/pthreadpool' 2025-07-17T08:41:33.3209506Z Entering 'third_party/pybind11' 2025-07-17T08:41:33.3246198Z Entering 'third_party/python-peachpy' 2025-07-17T08:41:33.3292259Z Entering 'third_party/sleef' 2025-07-17T08:41:33.3331939Z Entering 'third_party/tensorpipe' 2025-07-17T08:41:33.3370528Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T08:41:33.3410811Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T08:41:33.3456635Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T08:41:33.3498568Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T08:41:33.3539857Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T08:41:33.3602079Z ##[endgroup] 2025-07-17T08:41:33.3695372Z [command]/usr/bin/git log -1 --format=%H 2025-07-17T08:41:33.3766705Z a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:41:33.4099607Z Prepare all required actions 2025-07-17T08:41:33.4100458Z Getting action download info 2025-07-17T08:41:33.6166754Z ##[group]Run ./.github/actions/setup-rocm 2025-07-17T08:41:33.6167308Z env: 2025-07-17T08:41:33.6167664Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.6168105Z ##[endgroup] 2025-07-17T08:41:33.6215094Z ##[group]Run dpkg -l | grep -E " rocm" 2025-07-17T08:41:33.6215676Z dpkg -l | grep -E " rocm" 2025-07-17T08:41:33.6265392Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.6266083Z env: 2025-07-17T08:41:33.6266435Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.6266878Z ##[endgroup] 2025-07-17T08:41:33.6475130Z ii rocm 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) software stack meta package 2025-07-17T08:41:33.6476356Z ii rocm-cmake 0.13.0.60201-112~22.04 amd64 rocm-cmake built using CMake 2025-07-17T08:41:33.6477534Z ii rocm-core 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6478665Z ii rocm-dbgapi 0.76.0.60201-112~22.04 amd64 Library to provide AMD GPU debugger API 2025-07-17T08:41:33.6480228Z ii rocm-debug-agent 2.0.3.60201-112~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-07-17T08:41:33.6481707Z ii rocm-developer-tools 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6482980Z ii rocm-device-libs 1.0.0.60201-112~22.04 amd64 Radeon Open Compute - device libraries 2025-07-17T08:41:33.6483966Z ii rocm-gdb 14.2.60201-112~22.04 amd64 ROCgdb 2025-07-17T08:41:33.6485020Z ii rocm-hip-libraries 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6486228Z ii rocm-hip-runtime 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6487439Z ii rocm-hip-runtime-dev 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6488614Z ii rocm-hip-sdk 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6490295Z ii rocm-language-runtime 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6491419Z ii rocm-llvm 18.0.0.24355.60201-112~22.04 amd64 ROCm core compiler 2025-07-17T08:41:33.6492486Z ii rocm-ml-libraries 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6493624Z ii rocm-ml-sdk 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6494639Z ii rocm-opencl 2.0.0.60201-112~22.04 amd64 clr built using CMake 2025-07-17T08:41:33.6495598Z ii rocm-opencl-dev 2.0.0.60201-112~22.04 amd64 clr built using CMake 2025-07-17T08:41:33.6496735Z ii rocm-opencl-icd-loader 1.2.60201-112~22.04 amd64 OpenCL-ICD-Loader built using CMake 2025-07-17T08:41:33.6498062Z ii rocm-opencl-runtime 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6499996Z ii rocm-opencl-sdk 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6501374Z ii rocm-openmp-sdk 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) OpenMP Software development Kit. 2025-07-17T08:41:33.6502535Z ii rocm-smi-lib 7.3.0.60201-112~22.04 amd64 AMD System Management libraries 2025-07-17T08:41:33.6503593Z ii rocm-utils 6.2.1.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-07-17T08:41:33.6504715Z ii rocminfo 1.0.0.60201-112~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-07-17T08:41:33.6540206Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-07-17T08:41:33.6541150Z # ignore expansion of "docker ps -q" since it could be empty 2025-07-17T08:41:33.6541850Z # shellcheck disable=SC2046 2025-07-17T08:41:33.6542412Z docker stop $(docker ps -q) || true 2025-07-17T08:41:33.6542986Z # Prune all stopped containers. 2025-07-17T08:41:33.6543517Z docker container prune -f 2025-07-17T08:41:33.6589184Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.6589833Z env: 2025-07-17T08:41:33.6590206Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.6590626Z ##[endgroup] 2025-07-17T08:41:33.6911205Z "docker stop" requires at least 1 argument. 2025-07-17T08:41:33.6911767Z See 'docker stop --help'. 2025-07-17T08:41:33.6912049Z 2025-07-17T08:41:33.6912319Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2025-07-17T08:41:33.6912747Z 2025-07-17T08:41:33.6912923Z Stop one or more running containers 2025-07-17T08:41:33.7053848Z Total reclaimed space: 0B 2025-07-17T08:41:33.7141335Z ##[group]Run cat /etc/os-release || true 2025-07-17T08:41:33.7141925Z cat /etc/os-release || true 2025-07-17T08:41:33.7142557Z cat /etc/apt/sources.list.d/rocm.list || true 2025-07-17T08:41:33.7143196Z cat /opt/rocm/.info/version || true 2025-07-17T08:41:33.7143701Z whoami 2025-07-17T08:41:33.7195482Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.7196126Z env: 2025-07-17T08:41:33.7196495Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.7196934Z ##[endgroup] 2025-07-17T08:41:33.7290621Z PRETTY_NAME="Ubuntu 22.04.3 LTS" 2025-07-17T08:41:33.7291201Z NAME="Ubuntu" 2025-07-17T08:41:33.7291627Z VERSION_ID="22.04" 2025-07-17T08:41:33.7292648Z VERSION="22.04.3 LTS (Jammy Jellyfish)" 2025-07-17T08:41:33.7293178Z VERSION_CODENAME=jammy 2025-07-17T08:41:33.7293564Z ID=ubuntu 2025-07-17T08:41:33.7293892Z ID_LIKE=debian 2025-07-17T08:41:33.7294312Z HOME_URL="https://www.ubuntu.com/" 2025-07-17T08:41:33.7294882Z SUPPORT_URL="https://help.ubuntu.com/" 2025-07-17T08:41:33.7295505Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-07-17T08:41:33.7296418Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-07-17T08:41:33.7297301Z UBUNTU_CODENAME=jammy 2025-07-17T08:41:33.7308681Z deb [arch=amd64] https://repo.radeon.com/rocm/apt/6.2.1 jammy main 2025-07-17T08:41:33.7326215Z 6.2.1-112 2025-07-17T08:41:33.7336870Z pytorchci 2025-07-17T08:41:33.7374400Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-07-17T08:41:33.7374721Z dpkg -l | grep -E " amdgpu" 2025-07-17T08:41:33.7416337Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.7416668Z env: 2025-07-17T08:41:33.7416880Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.7417107Z ##[endgroup] 2025-07-17T08:41:33.7624727Z ii amdgpu-core 1:6.2.60201-2038383.22.04 all Core meta package for unified amdgpu driver. 2025-07-17T08:41:33.7625974Z ii amdgpu-dkms 1:6.8.5.60201-2038383.22.04 all amdgpu driver in DKMS format. 2025-07-17T08:41:33.7627740Z ii amdgpu-dkms-firmware 1:6.8.5.60201-2038383.22.04 all firmware blobs used by amdgpu driver in DKMS format 2025-07-17T08:41:33.7629053Z ii amdgpu-install 6.2.60201-2038383.22.04 all AMDGPU driver repository and installer 2025-07-17T08:41:33.7653725Z ##[group]Run rocm-smi 2025-07-17T08:41:33.7653993Z rocm-smi 2025-07-17T08:41:33.7698053Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.7698474Z env: 2025-07-17T08:41:33.7698703Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.7699010Z ##[endgroup] 2025-07-17T08:41:33.8536850Z 2025-07-17T08:41:33.8536912Z 2025-07-17T08:41:33.8537403Z ========================================= ROCm System Management Interface ========================================= 2025-07-17T08:41:33.8537913Z =================================================== Concise Info =================================================== 2025-07-17T08:41:33.8538493Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-07-17T08:41:33.8539357Z  (DID, GUID) (Edge) (Avg) (Mem, Compute, ID)  2025-07-17T08:41:33.8539876Z ==================================================================================================================== 2025-07-17T08:41:33.8541133Z 0 2 0x740f, 12261 40.0°C 41.0W N/A, N/A, 0 800Mhz 1600Mhz 0% auto 300.0W 0% 0% 2025-07-17T08:41:33.8542299Z 1 3 0x740f, 36740 38.0°C 41.0W N/A, N/A, 0 800Mhz 1600Mhz 0% auto 300.0W 0% 0% 2025-07-17T08:41:33.8545219Z ==================================================================================================================== 2025-07-17T08:41:33.8545930Z =============================================== End of ROCm SMI Log ================================================ 2025-07-17T08:41:33.8734830Z ##[group]Run rocminfo 2025-07-17T08:41:33.8735283Z rocminfo 2025-07-17T08:41:33.8779923Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.8780564Z env: 2025-07-17T08:41:33.8780922Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.8781361Z ##[endgroup] 2025-07-17T08:41:33.9532468Z ROCk module version 6.8.5 is loaded 2025-07-17T08:41:33.9533289Z ===================== 2025-07-17T08:41:33.9533796Z HSA System Attributes 2025-07-17T08:41:33.9534232Z ===================== 2025-07-17T08:41:33.9534688Z Runtime Version: 1.14 2025-07-17T08:41:33.9535705Z Runtime Ext Version: 1.6 2025-07-17T08:41:33.9536248Z System Timestamp Freq.: 1000.000000MHz 2025-07-17T08:41:33.9537089Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-07-17T08:41:33.9537993Z Machine Model: LARGE 2025-07-17T08:41:33.9538691Z System Endianness: LITTLE 2025-07-17T08:41:33.9539312Z Mwaitx: DISABLED 2025-07-17T08:41:33.9539832Z DMAbuf Support: YES 2025-07-17T08:41:33.9540125Z 2025-07-17T08:41:33.9540272Z ========== 2025-07-17T08:41:33.9540686Z HSA Agents 2025-07-17T08:41:33.9541065Z ========== 2025-07-17T08:41:33.9541437Z ******* 2025-07-17T08:41:33.9541805Z Agent 1 2025-07-17T08:41:33.9542184Z ******* 2025-07-17T08:41:33.9542668Z Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:41:33.9543355Z Uuid: CPU-XX 2025-07-17T08:41:33.9544087Z Marketing Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:41:33.9544743Z Vendor Name: CPU 2025-07-17T08:41:33.9545380Z Feature: None specified 2025-07-17T08:41:33.9546021Z Profile: FULL_PROFILE 2025-07-17T08:41:33.9547105Z Float Round Mode: NEAR 2025-07-17T08:41:33.9547785Z Max Queue Number: 0(0x0) 2025-07-17T08:41:33.9548404Z Queue Min Size: 0(0x0) 2025-07-17T08:41:33.9549036Z Queue Max Size: 0(0x0) 2025-07-17T08:41:33.9549683Z Queue Type: MULTI 2025-07-17T08:41:33.9550276Z Node: 0 2025-07-17T08:41:33.9550854Z Device Type: CPU 2025-07-17T08:41:33.9551464Z Cache Info: 2025-07-17T08:41:33.9551932Z L1: 32768(0x8000) KB 2025-07-17T08:41:33.9552497Z Chip ID: 0(0x0) 2025-07-17T08:41:33.9553111Z ASIC Revision: 0(0x0) 2025-07-17T08:41:33.9553752Z Cacheline Size: 64(0x40) 2025-07-17T08:41:33.9554419Z Max Clock Freq. (MHz): 2600 2025-07-17T08:41:33.9555016Z BDFID: 0 2025-07-17T08:41:33.9555629Z Internal Node ID: 0 2025-07-17T08:41:33.9556299Z Compute Unit: 32 2025-07-17T08:41:33.9556911Z SIMDs per CU: 0 2025-07-17T08:41:33.9557544Z Shader Engines: 0 2025-07-17T08:41:33.9558206Z Shader Arrs. per Eng.: 0 2025-07-17T08:41:33.9558899Z WatchPts on Addr. Ranges:1 2025-07-17T08:41:33.9559621Z Memory Properties: 2025-07-17T08:41:33.9560097Z Features: None 2025-07-17T08:41:33.9560535Z Pool Info: 2025-07-17T08:41:33.9560961Z Pool 1 2025-07-17T08:41:33.9561510Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-07-17T08:41:33.9562148Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:41:33.9562781Z Allocatable: TRUE 2025-07-17T08:41:33.9563461Z Alloc Granule: 4KB 2025-07-17T08:41:33.9564167Z Alloc Recommended Granule:4KB 2025-07-17T08:41:33.9564861Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9565538Z Accessible by all: TRUE 2025-07-17T08:41:33.9566451Z Pool 2 2025-07-17T08:41:33.9567013Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-07-17T08:41:33.9567647Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:41:33.9568260Z Allocatable: TRUE 2025-07-17T08:41:33.9568928Z Alloc Granule: 4KB 2025-07-17T08:41:33.9569608Z Alloc Recommended Granule:4KB 2025-07-17T08:41:33.9570305Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9570981Z Accessible by all: TRUE 2025-07-17T08:41:33.9571549Z Pool 3 2025-07-17T08:41:33.9572068Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:41:33.9572716Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:41:33.9573382Z Allocatable: TRUE 2025-07-17T08:41:33.9574115Z Alloc Granule: 4KB 2025-07-17T08:41:33.9574839Z Alloc Recommended Granule:4KB 2025-07-17T08:41:33.9575529Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9576486Z Accessible by all: TRUE 2025-07-17T08:41:33.9577104Z ISA Info: 2025-07-17T08:41:33.9577520Z ******* 2025-07-17T08:41:33.9577926Z Agent 2 2025-07-17T08:41:33.9578298Z ******* 2025-07-17T08:41:33.9578770Z Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:41:33.9579379Z Uuid: CPU-XX 2025-07-17T08:41:33.9580032Z Marketing Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:41:33.9580700Z Vendor Name: CPU 2025-07-17T08:41:33.9581336Z Feature: None specified 2025-07-17T08:41:33.9581975Z Profile: FULL_PROFILE 2025-07-17T08:41:33.9582608Z Float Round Mode: NEAR 2025-07-17T08:41:33.9583267Z Max Queue Number: 0(0x0) 2025-07-17T08:41:33.9583906Z Queue Min Size: 0(0x0) 2025-07-17T08:41:33.9584531Z Queue Max Size: 0(0x0) 2025-07-17T08:41:33.9585144Z Queue Type: MULTI 2025-07-17T08:41:33.9585738Z Node: 1 2025-07-17T08:41:33.9586334Z Device Type: CPU 2025-07-17T08:41:33.9586886Z Cache Info: 2025-07-17T08:41:33.9587350Z L1: 32768(0x8000) KB 2025-07-17T08:41:33.9587918Z Chip ID: 0(0x0) 2025-07-17T08:41:33.9588532Z ASIC Revision: 0(0x0) 2025-07-17T08:41:33.9589168Z Cacheline Size: 64(0x40) 2025-07-17T08:41:33.9589823Z Max Clock Freq. (MHz): 2600 2025-07-17T08:41:33.9590446Z BDFID: 0 2025-07-17T08:41:33.9591048Z Internal Node ID: 1 2025-07-17T08:41:33.9591695Z Compute Unit: 32 2025-07-17T08:41:33.9592311Z SIMDs per CU: 0 2025-07-17T08:41:33.9592938Z Shader Engines: 0 2025-07-17T08:41:33.9593581Z Shader Arrs. per Eng.: 0 2025-07-17T08:41:33.9594256Z WatchPts on Addr. Ranges:1 2025-07-17T08:41:33.9595164Z Memory Properties: 2025-07-17T08:41:33.9595595Z Features: None 2025-07-17T08:41:33.9596040Z Pool Info: 2025-07-17T08:41:33.9596524Z Pool 1 2025-07-17T08:41:33.9597058Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-07-17T08:41:33.9597702Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:41:33.9598328Z Allocatable: TRUE 2025-07-17T08:41:33.9598998Z Alloc Granule: 4KB 2025-07-17T08:41:33.9599829Z Alloc Recommended Granule:4KB 2025-07-17T08:41:33.9600516Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9601193Z Accessible by all: TRUE 2025-07-17T08:41:33.9601757Z Pool 2 2025-07-17T08:41:33.9602278Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-07-17T08:41:33.9602909Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:41:33.9603523Z Allocatable: TRUE 2025-07-17T08:41:33.9604286Z Alloc Granule: 4KB 2025-07-17T08:41:33.9605033Z Alloc Recommended Granule:4KB 2025-07-17T08:41:33.9606016Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9606706Z Accessible by all: TRUE 2025-07-17T08:41:33.9607269Z Pool 3 2025-07-17T08:41:33.9607783Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:41:33.9608382Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:41:33.9608997Z Allocatable: TRUE 2025-07-17T08:41:33.9609656Z Alloc Granule: 4KB 2025-07-17T08:41:33.9610341Z Alloc Recommended Granule:4KB 2025-07-17T08:41:33.9611030Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9611689Z Accessible by all: TRUE 2025-07-17T08:41:33.9612263Z ISA Info: 2025-07-17T08:41:33.9612657Z ******* 2025-07-17T08:41:33.9613060Z Agent 3 2025-07-17T08:41:33.9613440Z ******* 2025-07-17T08:41:33.9613872Z Name: gfx90a 2025-07-17T08:41:33.9614477Z Uuid: GPU-57cd99ed4cd4643e 2025-07-17T08:41:33.9615109Z Marketing Name: AMD Instinct MI210 2025-07-17T08:41:33.9615769Z Vendor Name: AMD 2025-07-17T08:41:33.9616392Z Feature: KERNEL_DISPATCH 2025-07-17T08:41:33.9617044Z Profile: BASE_PROFILE 2025-07-17T08:41:33.9617681Z Float Round Mode: NEAR 2025-07-17T08:41:33.9618333Z Max Queue Number: 128(0x80) 2025-07-17T08:41:33.9618986Z Queue Min Size: 64(0x40) 2025-07-17T08:41:33.9619612Z Queue Max Size: 131072(0x20000) 2025-07-17T08:41:33.9620247Z Queue Type: MULTI 2025-07-17T08:41:33.9620827Z Node: 2 2025-07-17T08:41:33.9621418Z Device Type: GPU 2025-07-17T08:41:33.9621964Z Cache Info: 2025-07-17T08:41:33.9622420Z L1: 16(0x10) KB 2025-07-17T08:41:33.9622971Z L2: 8192(0x2000) KB 2025-07-17T08:41:33.9623516Z Chip ID: 29711(0x740f) 2025-07-17T08:41:33.9624444Z ASIC Revision: 1(0x1) 2025-07-17T08:41:33.9625078Z Cacheline Size: 64(0x40) 2025-07-17T08:41:33.9625733Z Max Clock Freq. (MHz): 1700 2025-07-17T08:41:33.9626339Z BDFID: 768 2025-07-17T08:41:33.9626967Z Internal Node ID: 2 2025-07-17T08:41:33.9627608Z Compute Unit: 104 2025-07-17T08:41:33.9628214Z SIMDs per CU: 4 2025-07-17T08:41:33.9628851Z Shader Engines: 8 2025-07-17T08:41:33.9629504Z Shader Arrs. per Eng.: 1 2025-07-17T08:41:33.9630187Z WatchPts on Addr. Ranges:4 2025-07-17T08:41:33.9630894Z Coherent Host Access: FALSE 2025-07-17T08:41:33.9631510Z Memory Properties: 2025-07-17T08:41:33.9631997Z Features: KERNEL_DISPATCH 2025-07-17T08:41:33.9632594Z Fast F16 Operation: TRUE 2025-07-17T08:41:33.9633263Z Wavefront Size: 64(0x40) 2025-07-17T08:41:33.9633923Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:41:33.9634822Z Workgroup Max Size per Dimension: 2025-07-17T08:41:33.9635339Z x 1024(0x400) 2025-07-17T08:41:33.9635878Z y 1024(0x400) 2025-07-17T08:41:33.9636413Z z 1024(0x400) 2025-07-17T08:41:33.9637000Z Max Waves Per CU: 32(0x20) 2025-07-17T08:41:33.9637669Z Max Work-item Per CU: 2048(0x800) 2025-07-17T08:41:33.9638306Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:41:33.9638905Z Grid Max Size per Dimension: 2025-07-17T08:41:33.9639365Z x 4294967295(0xffffffff) 2025-07-17T08:41:33.9640016Z y 4294967295(0xffffffff) 2025-07-17T08:41:33.9640547Z z 4294967295(0xffffffff) 2025-07-17T08:41:33.9641192Z Max fbarriers/Workgrp: 32 2025-07-17T08:41:33.9650612Z Packet Processor uCode:: 83 2025-07-17T08:41:33.9651429Z SDMA engine uCode:: 8 2025-07-17T08:41:33.9652140Z IOMMU Support:: None 2025-07-17T08:41:33.9652758Z Pool Info: 2025-07-17T08:41:33.9653221Z Pool 1 2025-07-17T08:41:33.9653770Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:41:33.9654437Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:41:33.9655088Z Allocatable: TRUE 2025-07-17T08:41:33.9655739Z Alloc Granule: 4KB 2025-07-17T08:41:33.9656446Z Alloc Recommended Granule:2048KB 2025-07-17T08:41:33.9657138Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9657829Z Accessible by all: FALSE 2025-07-17T08:41:33.9658408Z Pool 2 2025-07-17T08:41:33.9658951Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-07-17T08:41:33.9659616Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:41:33.9660260Z Allocatable: TRUE 2025-07-17T08:41:33.9660942Z Alloc Granule: 4KB 2025-07-17T08:41:33.9661625Z Alloc Recommended Granule:2048KB 2025-07-17T08:41:33.9662785Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9663462Z Accessible by all: FALSE 2025-07-17T08:41:33.9664059Z Pool 3 2025-07-17T08:41:33.9664591Z Segment: GROUP 2025-07-17T08:41:33.9665198Z Size: 64(0x40) KB 2025-07-17T08:41:33.9665864Z Allocatable: FALSE 2025-07-17T08:41:33.9666512Z Alloc Granule: 0KB 2025-07-17T08:41:33.9667213Z Alloc Recommended Granule:0KB 2025-07-17T08:41:33.9667900Z Alloc Alignment: 0KB 2025-07-17T08:41:33.9668587Z Accessible by all: FALSE 2025-07-17T08:41:33.9669181Z ISA Info: 2025-07-17T08:41:33.9669606Z ISA 1 2025-07-17T08:41:33.9670162Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-07-17T08:41:33.9670881Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-07-17T08:41:33.9671585Z Profiles: HSA_PROFILE_BASE 2025-07-17T08:41:33.9672270Z Default Rounding Mode: NEAR 2025-07-17T08:41:33.9673288Z Default Rounding Mode: NEAR 2025-07-17T08:41:33.9673957Z Fast f16: TRUE 2025-07-17T08:41:33.9674607Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:41:33.9675247Z Workgroup Max Size per Dimension: 2025-07-17T08:41:33.9675795Z x 1024(0x400) 2025-07-17T08:41:33.9676361Z y 1024(0x400) 2025-07-17T08:41:33.9676882Z z 1024(0x400) 2025-07-17T08:41:33.9677497Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:41:33.9678092Z Grid Max Size per Dimension: 2025-07-17T08:41:33.9678606Z x 4294967295(0xffffffff) 2025-07-17T08:41:33.9679150Z y 4294967295(0xffffffff) 2025-07-17T08:41:33.9679851Z z 4294967295(0xffffffff) 2025-07-17T08:41:33.9680461Z FBarrier Max Size: 32 2025-07-17T08:41:33.9681030Z ******* 2025-07-17T08:41:33.9681434Z Agent 4 2025-07-17T08:41:33.9681814Z ******* 2025-07-17T08:41:33.9682276Z Name: gfx90a 2025-07-17T08:41:33.9682898Z Uuid: GPU-6b7c2bd8af06ff6a 2025-07-17T08:41:33.9683546Z Marketing Name: AMD Instinct MI210 2025-07-17T08:41:33.9684221Z Vendor Name: AMD 2025-07-17T08:41:33.9684845Z Feature: KERNEL_DISPATCH 2025-07-17T08:41:33.9685482Z Profile: BASE_PROFILE 2025-07-17T08:41:33.9686118Z Float Round Mode: NEAR 2025-07-17T08:41:33.9686787Z Max Queue Number: 128(0x80) 2025-07-17T08:41:33.9687427Z Queue Min Size: 64(0x40) 2025-07-17T08:41:33.9688137Z Queue Max Size: 131072(0x20000) 2025-07-17T08:41:33.9688881Z Queue Type: MULTI 2025-07-17T08:41:33.9689568Z Node: 3 2025-07-17T08:41:33.9690275Z Device Type: GPU 2025-07-17T08:41:33.9690928Z Cache Info: 2025-07-17T08:41:33.9691849Z L1: 16(0x10) KB 2025-07-17T08:41:33.9692487Z L2: 8192(0x2000) KB 2025-07-17T08:41:33.9693159Z Chip ID: 29711(0x740f) 2025-07-17T08:41:33.9693886Z ASIC Revision: 1(0x1) 2025-07-17T08:41:33.9694741Z Cacheline Size: 64(0x40) 2025-07-17T08:41:33.9695403Z Max Clock Freq. (MHz): 1700 2025-07-17T08:41:33.9696013Z BDFID: 33536 2025-07-17T08:41:33.9696640Z Internal Node ID: 3 2025-07-17T08:41:33.9697276Z Compute Unit: 104 2025-07-17T08:41:33.9697899Z SIMDs per CU: 4 2025-07-17T08:41:33.9698567Z Shader Engines: 8 2025-07-17T08:41:33.9699226Z Shader Arrs. per Eng.: 1 2025-07-17T08:41:33.9699957Z WatchPts on Addr. Ranges:4 2025-07-17T08:41:33.9700778Z Coherent Host Access: FALSE 2025-07-17T08:41:33.9701504Z Memory Properties: 2025-07-17T08:41:33.9702065Z Features: KERNEL_DISPATCH 2025-07-17T08:41:33.9702794Z Fast F16 Operation: TRUE 2025-07-17T08:41:33.9703849Z Wavefront Size: 64(0x40) 2025-07-17T08:41:33.9704538Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:41:33.9705163Z Workgroup Max Size per Dimension: 2025-07-17T08:41:33.9705667Z x 1024(0x400) 2025-07-17T08:41:33.9706199Z y 1024(0x400) 2025-07-17T08:41:33.9706711Z z 1024(0x400) 2025-07-17T08:41:33.9707321Z Max Waves Per CU: 32(0x20) 2025-07-17T08:41:33.9707969Z Max Work-item Per CU: 2048(0x800) 2025-07-17T08:41:33.9708635Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:41:33.9709230Z Grid Max Size per Dimension: 2025-07-17T08:41:33.9709690Z x 4294967295(0xffffffff) 2025-07-17T08:41:33.9710247Z y 4294967295(0xffffffff) 2025-07-17T08:41:33.9710791Z z 4294967295(0xffffffff) 2025-07-17T08:41:33.9711430Z Max fbarriers/Workgrp: 32 2025-07-17T08:41:33.9712139Z Packet Processor uCode:: 83 2025-07-17T08:41:33.9712828Z SDMA engine uCode:: 8 2025-07-17T08:41:33.9713492Z IOMMU Support:: None 2025-07-17T08:41:33.9714054Z Pool Info: 2025-07-17T08:41:33.9714485Z Pool 1 2025-07-17T08:41:33.9715014Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:41:33.9715668Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:41:33.9716285Z Allocatable: TRUE 2025-07-17T08:41:33.9716939Z Alloc Granule: 4KB 2025-07-17T08:41:33.9717641Z Alloc Recommended Granule:2048KB 2025-07-17T08:41:33.9718330Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9719005Z Accessible by all: FALSE 2025-07-17T08:41:33.9719678Z Pool 2 2025-07-17T08:41:33.9720217Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-07-17T08:41:33.9720861Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:41:33.9721489Z Allocatable: TRUE 2025-07-17T08:41:33.9722473Z Alloc Granule: 4KB 2025-07-17T08:41:33.9723167Z Alloc Recommended Granule:2048KB 2025-07-17T08:41:33.9723867Z Alloc Alignment: 4KB 2025-07-17T08:41:33.9724536Z Accessible by all: FALSE 2025-07-17T08:41:33.9725136Z Pool 3 2025-07-17T08:41:33.9725627Z Segment: GROUP 2025-07-17T08:41:33.9726231Z Size: 64(0x40) KB 2025-07-17T08:41:33.9726850Z Allocatable: FALSE 2025-07-17T08:41:33.9727495Z Alloc Granule: 0KB 2025-07-17T08:41:33.9728197Z Alloc Recommended Granule:0KB 2025-07-17T08:41:33.9728884Z Alloc Alignment: 0KB 2025-07-17T08:41:33.9729579Z Accessible by all: FALSE 2025-07-17T08:41:33.9730150Z ISA Info: 2025-07-17T08:41:33.9730575Z ISA 1 2025-07-17T08:41:33.9731105Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-07-17T08:41:33.9731819Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-07-17T08:41:33.9732797Z Profiles: HSA_PROFILE_BASE 2025-07-17T08:41:33.9733490Z Default Rounding Mode: NEAR 2025-07-17T08:41:33.9734201Z Default Rounding Mode: NEAR 2025-07-17T08:41:33.9734843Z Fast f16: TRUE 2025-07-17T08:41:33.9735499Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:41:33.9736119Z Workgroup Max Size per Dimension: 2025-07-17T08:41:33.9736696Z x 1024(0x400) 2025-07-17T08:41:33.9737248Z y 1024(0x400) 2025-07-17T08:41:33.9737768Z z 1024(0x400) 2025-07-17T08:41:33.9738379Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:41:33.9738961Z Grid Max Size per Dimension: 2025-07-17T08:41:33.9739486Z x 4294967295(0xffffffff) 2025-07-17T08:41:33.9740019Z y 4294967295(0xffffffff) 2025-07-17T08:41:33.9740547Z z 4294967295(0xffffffff) 2025-07-17T08:41:33.9741147Z FBarrier Max Size: 32 2025-07-17T08:41:33.9741707Z *** Done *** 2025-07-17T08:41:33.9777759Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-07-17T08:41:33.9778513Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-07-17T08:41:33.9779802Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-07-17T08:41:33.9780985Z if [[ $ngpu -eq 0 ]]; then 2025-07-17T08:41:33.9781598Z  echo "Error: Failed to detect any GPUs on the runner" 2025-07-17T08:41:33.9782255Z  echo "$msg" 2025-07-17T08:41:33.9782680Z  exit 1 2025-07-17T08:41:33.9783066Z fi 2025-07-17T08:41:33.9783428Z if [[ $ngpu -eq 1 ]]; then 2025-07-17T08:41:33.9784194Z  echo "Error: only 1 GPU detected, at least 2 GPUs are needed for distributed jobs" 2025-07-17T08:41:33.9784988Z  echo "$msg" 2025-07-17T08:41:33.9785394Z  exit 1 2025-07-17T08:41:33.9785780Z fi 2025-07-17T08:41:33.9831731Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:33.9832377Z env: 2025-07-17T08:41:33.9832739Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:33.9833530Z ##[endgroup] 2025-07-17T08:41:34.0846744Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-07-17T08:41:34.0847449Z with: 2025-07-17T08:41:34.0847870Z diskspace-cutoff: 70 2025-07-17T08:41:34.0848353Z env: 2025-07-17T08:41:34.0848777Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:34.0849267Z ##[endgroup] 2025-07-17T08:41:34.0907185Z ##[group]Run set -ex 2025-07-17T08:41:34.0907777Z set -ex 2025-07-17T08:41:34.0908253Z diskspace_cutoff=70 2025-07-17T08:41:34.0909015Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-07-17T08:41:34.0909850Z if [ ! -d "$docker_root_dir" ]; then 2025-07-17T08:41:34.0910893Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-07-17T08:41:34.0911721Z  exit 0 2025-07-17T08:41:34.0912101Z fi 2025-07-17T08:41:34.0912819Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-07-17T08:41:34.0914360Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-07-17T08:41:34.0915656Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-07-17T08:41:34.0916288Z  docker system prune -af 2025-07-17T08:41:34.0917537Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-07-17T08:41:34.0918537Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-07-17T08:41:34.0919675Z  echo "Error: Available diskspace is less than $diskspace_cutoff percent. Not enough diskspace." 2025-07-17T08:41:34.0920568Z  echo "$msg" 2025-07-17T08:41:34.0920994Z  exit 1 2025-07-17T08:41:34.0921397Z  else 2025-07-17T08:41:34.0921865Z  difference=$((diskspace - diskspace_new)) 2025-07-17T08:41:34.0922569Z  echo "Diskspace saved: $difference percent" 2025-07-17T08:41:34.0923131Z  fi 2025-07-17T08:41:34.0923474Z fi 2025-07-17T08:41:34.0971520Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:34.0972283Z env: 2025-07-17T08:41:34.0972729Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:34.0973229Z ##[endgroup] 2025-07-17T08:41:34.1043377Z + diskspace_cutoff=70 2025-07-17T08:41:34.1049868Z ++ docker info -f '{{.DockerRootDir}}' 2025-07-17T08:41:34.1570180Z + docker_root_dir=/home/pytorchci/.local/share/docker 2025-07-17T08:41:34.1570638Z + '[' '!' -d /home/pytorchci/.local/share/docker ']' 2025-07-17T08:41:34.1583661Z ++ df -H --output=pcent /home/pytorchci/.local/share/docker 2025-07-17T08:41:34.1584217Z ++ sed s/%// 2025-07-17T08:41:34.1584523Z ++ sed -n 2p 2025-07-17T08:41:34.1584855Z ++ sed 's/ //' 2025-07-17T08:41:34.1603306Z + diskspace=40 2025-07-17T08:41:34.1604229Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified' 2025-07-17T08:41:34.1605193Z + [[ 40 -ge 70 ]] 2025-07-17T08:41:34.1638556Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-07-17T08:41:34.1638946Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-07-17T08:41:34.1639234Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-07-17T08:41:34.1639555Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-07-17T08:41:34.1639920Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-07-17T08:41:34.1640234Z  2025-07-17T08:41:34.1640463Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-07-17T08:41:34.1640757Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-07-17T08:41:34.1641009Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-07-17T08:41:34.1641346Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-07-17T08:41:34.1641656Z  2025-07-17T08:41:34.1642091Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-07-17T08:41:34.1642322Z rm -rf "${RUNNER_DOCS_DIR}" 2025-07-17T08:41:34.1642543Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-07-17T08:41:34.1642824Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-07-17T08:41:34.1665059Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:34.1665353Z env: 2025-07-17T08:41:34.1665536Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:34.1665727Z ##[endgroup] 2025-07-17T08:41:34.1810062Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-07-17T08:41:34.1810475Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-07-17T08:41:34.1810831Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-07-17T08:41:34.1832270Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:34.1832542Z env: 2025-07-17T08:41:34.1832712Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:34.1833045Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:34.1833474Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:34.1833875Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:34.1834149Z ##[endgroup] 2025-07-17T08:41:34.1937395Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-07-17T08:41:34.1938067Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-07-17T08:41:34.1938549Z # Add render group for container creation. 2025-07-17T08:41:34.1938941Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-07-17T08:41:34.1939565Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-07-17T08:41:34.1940610Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-07-17T08:41:34.1941500Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-07-17T08:41:34.1942224Z else 2025-07-17T08:41:34.1942636Z  DEVICE_FLAG="--device /dev/dri" 2025-07-17T08:41:34.1943140Z fi 2025-07-17T08:41:34.1943947Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-07-17T08:41:34.1945225Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-07-17T08:41:34.1946390Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-07-17T08:41:34.1947607Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary. 2025-07-17T08:41:34.1949696Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-07-17T08:41:34.1992791Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:34.1993429Z env: 2025-07-17T08:41:34.1993795Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:34.1994475Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:34.1995486Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:34.1996563Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:34.1997336Z ##[endgroup] 2025-07-17T08:41:34.2194625Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-07-17T08:41:34.2195039Z with: 2025-07-17T08:41:34.2195301Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-07-17T08:41:34.2195626Z aws-region: us-east-1 2025-07-17T08:41:34.2195826Z role-duration-seconds: 18000 2025-07-17T08:41:34.2196035Z audience: sts.amazonaws.com 2025-07-17T08:41:34.2196471Z env: 2025-07-17T08:41:34.2196622Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:34.2196924Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:34.2197344Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:34.2197742Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:34.2198462Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:34.2199065Z ##[endgroup] 2025-07-17T08:41:34.6174842Z Assuming role with OIDC 2025-07-17T08:41:34.9738129Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-07-17T08:41:35.0857919Z ##[group]Run aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-07-17T08:41:35.0858792Z with: 2025-07-17T08:41:35.0859197Z mask-password: true 2025-07-17T08:41:35.0859696Z registry-type: private 2025-07-17T08:41:35.0860148Z skip-logout: false 2025-07-17T08:41:35.0860553Z env: 2025-07-17T08:41:35.0860948Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:35.0861664Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:35.0862711Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:35.0863670Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:35.0865353Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:35.0866869Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:35.0867418Z AWS_REGION: us-east-1 2025-07-17T08:41:35.0868458Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:35.0869129Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:35.0878829Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:35.0879285Z ##[endgroup] 2025-07-17T08:41:35.5698804Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3071852Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-07-17T08:41:36.3072857Z with: 2025-07-17T08:41:36.3074335Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3075904Z use-custom-docker-registry: true 2025-07-17T08:41:36.3076448Z docker-build-dir: .ci/docker 2025-07-17T08:41:36.3076971Z docker-build-script: ./build.sh 2025-07-17T08:41:36.3077483Z working-directory: . 2025-07-17T08:41:36.3078090Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3078781Z force-push: false 2025-07-17T08:41:36.3079179Z env: 2025-07-17T08:41:36.3079819Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:36.3080580Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:36.3081871Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:36.3083065Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:36.3084830Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:36.3086344Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:36.3086822Z AWS_REGION: us-east-1 2025-07-17T08:41:36.3087487Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:36.3088137Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:36.3096928Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:36.3097167Z ##[endgroup] 2025-07-17T08:41:36.3131960Z ##[group]Run set -ex 2025-07-17T08:41:36.3132580Z set -ex 2025-07-17T08:41:36.3133044Z  2025-07-17T08:41:36.3133858Z # If the docker build directory or the build script doesn't exist, the action will 2025-07-17T08:41:36.3135199Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-07-17T08:41:36.3135709Z # job could then download the pre-built image as usual 2025-07-17T08:41:36.3136306Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-07-17T08:41:36.3136867Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3137162Z else 2025-07-17T08:41:36.3137407Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3137801Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3138162Z  2025-07-17T08:41:36.3138638Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 2025-07-17T08:41:36.3139203Z  exit 0 2025-07-17T08:41:36.3139416Z fi 2025-07-17T08:41:36.3139617Z  2025-07-17T08:41:36.3139919Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-07-17T08:41:36.3140477Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-07-17T08:41:36.3141153Z  # use it as it is, but first let's extract the tag 2025-07-17T08:41:36.3142152Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-07-17T08:41:36.3143187Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3144070Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3144745Z else 2025-07-17T08:41:36.3145204Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-07-17T08:41:36.3145858Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-07-17T08:41:36.3146512Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-07-17T08:41:36.3147081Z  fi 2025-07-17T08:41:36.3148220Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-07-17T08:41:36.3149265Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3150335Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3151505Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3152223Z fi 2025-07-17T08:41:36.3199722Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:36.3200409Z env: 2025-07-17T08:41:36.3200792Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:36.3201522Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:36.3202601Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:36.3203597Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:36.3205331Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:36.3206858Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:36.3207351Z AWS_REGION: us-east-1 2025-07-17T08:41:36.3207932Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:36.3208729Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:36.3217107Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:36.3217346Z REPO_NAME: pytorch 2025-07-17T08:41:36.3217993Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3218698Z DOCKER_BUILD_DIR: .ci/docker 2025-07-17T08:41:36.3218965Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-07-17T08:41:36.3219304Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3219865Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-07-17T08:41:36.3220128Z CUSTOM_TAG_PREFIX: 2025-07-17T08:41:36.3220358Z ##[endgroup] 2025-07-17T08:41:36.3287921Z + [[ -d .ci/docker ]] 2025-07-17T08:41:36.3288536Z + [[ -f .ci/docker/./build.sh ]] 2025-07-17T08:41:36.3289168Z + [[ true == \t\r\u\e ]] 2025-07-17T08:41:36.3289704Z + echo skip=false 2025-07-17T08:41:36.3291730Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-07-17T08:41:36.3304474Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3305896Z ++ awk -F '[:,]' '{print $2}' 2025-07-17T08:41:36.3336053Z + DOCKER_TAG=pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3336771Z + echo docker-tag=pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3337740Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3391872Z ##[group]Run set +e 2025-07-17T08:41:36.3392863Z set +e 2025-07-17T08:41:36.3393439Z set -x 2025-07-17T08:41:36.3394128Z  2025-07-17T08:41:36.3394742Z login() { 2025-07-17T08:41:36.3395817Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-07-17T08:41:36.3396978Z } 2025-07-17T08:41:36.3397565Z  2025-07-17T08:41:36.3398193Z retry () { 2025-07-17T08:41:36.3398908Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-07-17T08:41:36.3399707Z } 2025-07-17T08:41:36.3400344Z  2025-07-17T08:41:36.3401107Z retry login "${DOCKER_REGISTRY}" 2025-07-17T08:41:36.3401822Z  2025-07-17T08:41:36.3402540Z START_TIME=$(date +%s) 2025-07-17T08:41:36.3403260Z # Wait up to 120 minutes 2025-07-17T08:41:36.3404559Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-07-17T08:41:36.3405688Z  # Check if image already exists, if it does then skip building it 2025-07-17T08:41:36.3406745Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-07-17T08:41:36.3407669Z  exit 0 2025-07-17T08:41:36.3408352Z  fi 2025-07-17T08:41:36.3409033Z  2025-07-17T08:41:36.3410271Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-07-17T08:41:36.3411846Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-07-17T08:41:36.3413460Z  # latter, it will wait for the Docker images to become available before continuing 2025-07-17T08:41:36.3414621Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-07-17T08:41:36.3415176Z  # It's a Docker build job, let's build the image 2025-07-17T08:41:36.3415565Z  break 2025-07-17T08:41:36.3415910Z  else 2025-07-17T08:41:36.3416343Z  # It's a regular build job, wait for the image to become available 2025-07-17T08:41:36.3416840Z  sleep 300 2025-07-17T08:41:36.3417168Z  fi 2025-07-17T08:41:36.3417468Z done 2025-07-17T08:41:36.3417882Z  2025-07-17T08:41:36.3418321Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-07-17T08:41:36.3418923Z # be empty. The default action would be to continue rebuild the image 2025-07-17T08:41:36.3419499Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-07-17T08:41:36.3420091Z  # if we're on the base branch then use the parent commit 2025-07-17T08:41:36.3421129Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-07-17T08:41:36.3422483Z else 2025-07-17T08:41:36.3423275Z  # otherwise we're on a PR, so use the most recent base commit 2025-07-17T08:41:36.3424426Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-07-17T08:41:36.3425239Z fi 2025-07-17T08:41:36.3425800Z  2025-07-17T08:41:36.3426530Z if [[ -z "${MERGE_BASE}" ]]; then 2025-07-17T08:41:36.3427354Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3428137Z  2025-07-17T08:41:36.3429191Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-07-17T08:41:36.3430325Z  exit 0 2025-07-17T08:41:36.3431004Z fi 2025-07-17T08:41:36.3431573Z  2025-07-17T08:41:36.3432274Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-07-17T08:41:36.3433748Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-07-17T08:41:36.3434954Z  exit 1 2025-07-17T08:41:36.3435702Z fi 2025-07-17T08:41:36.3436308Z  2025-07-17T08:41:36.3437131Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-07-17T08:41:36.3438505Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-07-17T08:41:36.3439778Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-07-17T08:41:36.3441140Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-07-17T08:41:36.3442708Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-07-17T08:41:36.3443598Z fi 2025-07-17T08:41:36.3444159Z  2025-07-17T08:41:36.3444886Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-07-17T08:41:36.3495164Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:36.3495606Z env: 2025-07-17T08:41:36.3495940Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:36.3496636Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:36.3497327Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:36.3497946Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:36.3499000Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:36.3499910Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:36.3500376Z AWS_REGION: us-east-1 2025-07-17T08:41:36.3501552Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:36.3502612Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:36.3512805Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:36.3513429Z DOCKER_BUILD_DIR: .ci/docker 2025-07-17T08:41:36.3514193Z BASE_REVISION: a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:41:36.3516003Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3517881Z DOCKER_TAG: pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:36.3519146Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3520233Z DOCKER_PUSH: 2025-07-17T08:41:36.3520837Z ##[endgroup] 2025-07-17T08:41:36.3598689Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3600138Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3604085Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:36.3605533Z + aws ecr get-login-password --region us-east-1 2025-07-17T08:41:37.8001641Z WARNING! Your password will be stored unencrypted in /home/pytorchci/.docker/config.json. 2025-07-17T08:41:37.8003633Z Configure a credential helper to remove this warning. See 2025-07-17T08:41:37.8037247Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-07-17T08:41:37.8038095Z 2025-07-17T08:41:37.8038322Z Login Succeeded 2025-07-17T08:41:37.8039040Z ++ date +%s 2025-07-17T08:41:37.8048894Z + START_TIME=1752741697 2025-07-17T08:41:37.8053717Z ++ date +%s 2025-07-17T08:41:37.8062089Z + [[ 1752734497 -lt 1752741697 ]] 2025-07-17T08:41:37.8063600Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:39.2066307Z { 2025-07-17T08:41:39.2066788Z "schemaVersion": 2, 2025-07-17T08:41:39.2067600Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-07-17T08:41:39.2068384Z "config": { 2025-07-17T08:41:39.2068984Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-07-17T08:41:39.2069701Z "size": 28418, 2025-07-17T08:41:39.2070467Z "digest": "sha256:0f464723fd0e297450625b7697dd0dc4ef837960dceaa5bd0f2f4d7d8b9987eb" 2025-07-17T08:41:39.2071275Z }, 2025-07-17T08:41:39.2071633Z "layers": [ 2025-07-17T08:41:39.2072010Z { 2025-07-17T08:41:39.2072589Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2073315Z "size": 30446997, 2025-07-17T08:41:39.2074032Z "digest": "sha256:66587c81b81a58d07e40c48d900a1517516bbf58c4378c687d89d645824f5e5f" 2025-07-17T08:41:39.2074794Z }, 2025-07-17T08:41:39.2075122Z { 2025-07-17T08:41:39.2075682Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2076379Z "size": 1554, 2025-07-17T08:41:39.2077245Z "digest": "sha256:a3c451a3328b650c5ddade4025cc44a20eea3fa108daf68fb805d5038f4b327a" 2025-07-17T08:41:39.2078031Z }, 2025-07-17T08:41:39.2078354Z { 2025-07-17T08:41:39.2078898Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2079873Z "size": 313270340, 2025-07-17T08:41:39.2080737Z "digest": "sha256:74e64bc968c1cf8f97d841b7ea0266de87fa2e1b8b4254f71e0350252bcc7f1a" 2025-07-17T08:41:39.2081627Z }, 2025-07-17T08:41:39.2082448Z { 2025-07-17T08:41:39.2083015Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2083700Z "size": 703, 2025-07-17T08:41:39.2084385Z "digest": "sha256:9bcb8a79b45b53f1f900b5be887bf24d1166b93e73447e1ed1d7d17265eec376" 2025-07-17T08:41:39.2085165Z }, 2025-07-17T08:41:39.2085483Z { 2025-07-17T08:41:39.2086018Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2086693Z "size": 1212, 2025-07-17T08:41:39.2087369Z "digest": "sha256:35c764410de1133f1414d9b3f52ac20cc4a04bc5dad2555fce5fed7fc7b53e27" 2025-07-17T08:41:39.2088138Z }, 2025-07-17T08:41:39.2088451Z { 2025-07-17T08:41:39.2088968Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2089648Z "size": 485, 2025-07-17T08:41:39.2090301Z "digest": "sha256:501f1b9049b569de8832d393bd28cb9461fb1a5748f254493059d06f95078d42" 2025-07-17T08:41:39.2091059Z }, 2025-07-17T08:41:39.2091372Z { 2025-07-17T08:41:39.2091905Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2092581Z "size": 110343714, 2025-07-17T08:41:39.2093270Z "digest": "sha256:52662c4baedc58d84a36538f1ec46a58ef357581acf221294a5b5400df3613a7" 2025-07-17T08:41:39.2094028Z }, 2025-07-17T08:41:39.2094336Z { 2025-07-17T08:41:39.2094844Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2095512Z "size": 4084, 2025-07-17T08:41:39.2096171Z "digest": "sha256:e176b5597377db9ba0da10a2949417bd291ab8c8880d4cf810bfa46963514355" 2025-07-17T08:41:39.2096932Z }, 2025-07-17T08:41:39.2097244Z { 2025-07-17T08:41:39.2097763Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2098432Z "size": 1709, 2025-07-17T08:41:39.2099084Z "digest": "sha256:0d668fce00b22755f15b2c6a7185f5e4cd9e016804fd763b9c213c24330ad63e" 2025-07-17T08:41:39.2100354Z }, 2025-07-17T08:41:39.2100732Z { 2025-07-17T08:41:39.2101341Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2102016Z "size": 724, 2025-07-17T08:41:39.2102681Z "digest": "sha256:d81f985aa40d5111b424e8084dfa90790c3078ab270debfa5b46f37c2423c5ce" 2025-07-17T08:41:39.2103438Z }, 2025-07-17T08:41:39.2103739Z { 2025-07-17T08:41:39.2104268Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2104950Z "size": 3310464312, 2025-07-17T08:41:39.2105657Z "digest": "sha256:ec9c560bb82fd80fe1a703d8635753191c3bf91ce72bf3ef3d7c988cdc60709f" 2025-07-17T08:41:39.2106436Z }, 2025-07-17T08:41:39.2106746Z { 2025-07-17T08:41:39.2107266Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2107939Z "size": 381, 2025-07-17T08:41:39.2108620Z "digest": "sha256:bef037cce55fd4003ac13320e72dc1e8d781127e52efd4608ef8cc5f7d5a4898" 2025-07-17T08:41:39.2109406Z }, 2025-07-17T08:41:39.2109719Z { 2025-07-17T08:41:39.2110236Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2110920Z "size": 233575, 2025-07-17T08:41:39.2111601Z "digest": "sha256:cc14155fac7b29e7b199a85aef3d20af7112604a2dc86d5dd0a4c89f269d5457" 2025-07-17T08:41:39.2112477Z }, 2025-07-17T08:41:39.2112947Z { 2025-07-17T08:41:39.2113499Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2114217Z "size": 791, 2025-07-17T08:41:39.2114930Z "digest": "sha256:98b40f6827d2be8fb97e8ebec31f1b0211c3efcaf33d3f81ce804245254e0f44" 2025-07-17T08:41:39.2115737Z }, 2025-07-17T08:41:39.2116048Z { 2025-07-17T08:41:39.2116576Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2117255Z "size": 106, 2025-07-17T08:41:39.2117905Z "digest": "sha256:2816c98112963b1630214fb5e27007cd06f8d422a17d61f642a0ca764e305c64" 2025-07-17T08:41:39.2118642Z }, 2025-07-17T08:41:39.2118958Z { 2025-07-17T08:41:39.2119604Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2120315Z "size": 1495, 2025-07-17T08:41:39.2121335Z "digest": "sha256:835aba0facaacf7c3dca5e1f039808b834ec29306e4eabf749c6541fa041ed2d" 2025-07-17T08:41:39.2122132Z }, 2025-07-17T08:41:39.2122451Z { 2025-07-17T08:41:39.2122979Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2123667Z "size": 455280682, 2025-07-17T08:41:39.2124347Z "digest": "sha256:3a7045d55b23c0af9416a558f55fd6b5b7c56060d2fb34178134bd11197aea01" 2025-07-17T08:41:39.2125103Z }, 2025-07-17T08:41:39.2125408Z { 2025-07-17T08:41:39.2125926Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2126602Z "size": 162, 2025-07-17T08:41:39.2127263Z "digest": "sha256:23e65526d2c2ad371238cebf590b9c1a5215668bc35de974fb423e3cb6674ecf" 2025-07-17T08:41:39.2128024Z }, 2025-07-17T08:41:39.2128332Z { 2025-07-17T08:41:39.2128855Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2129534Z "size": 2370, 2025-07-17T08:41:39.2130234Z "digest": "sha256:0bb9b2bb3a77fff33a12e20d8fe9f5c637c339facf000c4407fc1c4515980aa8" 2025-07-17T08:41:39.2131021Z }, 2025-07-17T08:41:39.2131324Z { 2025-07-17T08:41:39.2131844Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2132536Z "size": 7164120706, 2025-07-17T08:41:39.2133240Z "digest": "sha256:89c2ed775d8d10f463a2ed1ab047b356444911b18742cee8b651d57c2cf42ace" 2025-07-17T08:41:39.2134021Z }, 2025-07-17T08:41:39.2134340Z { 2025-07-17T08:41:39.2134887Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2135582Z "size": 105, 2025-07-17T08:41:39.2136262Z "digest": "sha256:22367c15654a15dbf35e3b8360202ebac7bdace0703233bf286a9a4f0d726c59" 2025-07-17T08:41:39.2137025Z }, 2025-07-17T08:41:39.2137337Z { 2025-07-17T08:41:39.2137860Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2138845Z "size": 611, 2025-07-17T08:41:39.2139579Z "digest": "sha256:41dd58eb2af160ecb43af63c54b552825ebfa78a973bd316c44eb1ddfff16d18" 2025-07-17T08:41:39.2140515Z }, 2025-07-17T08:41:39.2140886Z { 2025-07-17T08:41:39.2141461Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2142142Z "size": 677677419, 2025-07-17T08:41:39.2142854Z "digest": "sha256:0c0bf7dba119dca098a5b3c549795939817b9de42b80ee5dbd9f9d6871cdd5da" 2025-07-17T08:41:39.2143633Z }, 2025-07-17T08:41:39.2143944Z { 2025-07-17T08:41:39.2144463Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2145124Z "size": 111, 2025-07-17T08:41:39.2145794Z "digest": "sha256:d49b2c2152c7848476fc7c52aad3b33c65554fcaf9f0c72dd790a0c0ac68a000" 2025-07-17T08:41:39.2146561Z }, 2025-07-17T08:41:39.2146874Z { 2025-07-17T08:41:39.2147400Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2148080Z "size": 1556, 2025-07-17T08:41:39.2148757Z "digest": "sha256:e5f6086238542da1a27d6d92acd14a4320cdca70eedb0b550e883bb0c93d4c7b" 2025-07-17T08:41:39.2149537Z }, 2025-07-17T08:41:39.2149844Z { 2025-07-17T08:41:39.2150413Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2151096Z "size": 107, 2025-07-17T08:41:39.2151754Z "digest": "sha256:f833b0ca2e595948d8dfd847ff60d2125781be4b1216d293329bfe8fa3f88378" 2025-07-17T08:41:39.2152515Z }, 2025-07-17T08:41:39.2152828Z { 2025-07-17T08:41:39.2153352Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2154021Z "size": 166, 2025-07-17T08:41:39.2154660Z "digest": "sha256:ab9877515928d6413a2db6769cf16e7bfafa92ab8967349471de8610df1ee728" 2025-07-17T08:41:39.2155414Z }, 2025-07-17T08:41:39.2155731Z { 2025-07-17T08:41:39.2156247Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2156919Z "size": 2801144, 2025-07-17T08:41:39.2157603Z "digest": "sha256:9f52d3b83f081eb26e37656421cc991942dc5c025ec4ebf8e5312e61f5a9aff2" 2025-07-17T08:41:39.2158361Z }, 2025-07-17T08:41:39.2158926Z { 2025-07-17T08:41:39.2159552Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2160230Z "size": 107, 2025-07-17T08:41:39.2160877Z "digest": "sha256:a03e9776803ba5945e13460659b97af55674b9246fddce732c65215f14ce6846" 2025-07-17T08:41:39.2161618Z }, 2025-07-17T08:41:39.2161926Z { 2025-07-17T08:41:39.2162446Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2163118Z "size": 802, 2025-07-17T08:41:39.2163775Z "digest": "sha256:60bb50caf36b82f9e2c5219ce0415da7571e910f43a183d4f07e1b5f8357159b" 2025-07-17T08:41:39.2164536Z }, 2025-07-17T08:41:39.2164846Z { 2025-07-17T08:41:39.2165361Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2166049Z "size": 26113574, 2025-07-17T08:41:39.2166764Z "digest": "sha256:6fd0a137ed1d7b3abed2c19e7b0013c411d0d0a0642c8a45f7064e704f71c090" 2025-07-17T08:41:39.2167611Z }, 2025-07-17T08:41:39.2167948Z { 2025-07-17T08:41:39.2168525Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2169230Z "size": 104, 2025-07-17T08:41:39.2169941Z "digest": "sha256:865dca06b4bef4059f415bdfb4c535212687fed6f88d9d4202721c50c4dddf43" 2025-07-17T08:41:39.2170729Z }, 2025-07-17T08:41:39.2171050Z { 2025-07-17T08:41:39.2171603Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2172297Z "size": 425, 2025-07-17T08:41:39.2172978Z "digest": "sha256:d2b21c379220ce9e7b232a254b4d6a6c8f5bb93b6c4b32d38146b4f7929d83d4" 2025-07-17T08:41:39.2173752Z }, 2025-07-17T08:41:39.2174071Z { 2025-07-17T08:41:39.2174615Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2175302Z "size": 19308729, 2025-07-17T08:41:39.2175988Z "digest": "sha256:27e06e0e000d0089df33a38f23ba60254494e6a2a47d1a390cf498659527ca4e" 2025-07-17T08:41:39.2177080Z }, 2025-07-17T08:41:39.2177405Z { 2025-07-17T08:41:39.2177944Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2178631Z "size": 425, 2025-07-17T08:41:39.2179314Z "digest": "sha256:e580816fdea358adca6416fddf604089befb1c5e1ebe89954e79d176c0c88b09" 2025-07-17T08:41:39.2180245Z }, 2025-07-17T08:41:39.2180628Z { 2025-07-17T08:41:39.2181256Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2181941Z "size": 724, 2025-07-17T08:41:39.2182604Z "digest": "sha256:d81f985aa40d5111b424e8084dfa90790c3078ab270debfa5b46f37c2423c5ce" 2025-07-17T08:41:39.2183376Z }, 2025-07-17T08:41:39.2183698Z { 2025-07-17T08:41:39.2184236Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2184900Z "size": 141, 2025-07-17T08:41:39.2185566Z "digest": "sha256:42a16ea59e6842dd518eb22d4c479d66b72d202c46406dccb01ea366651dd9cb" 2025-07-17T08:41:39.2186353Z }, 2025-07-17T08:41:39.2186674Z { 2025-07-17T08:41:39.2187206Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2187899Z "size": 137, 2025-07-17T08:41:39.2188582Z "digest": "sha256:e3bdcc2c9a2c6093950eadf56a935f113e3b08f3d6eaa93890df99276389800d" 2025-07-17T08:41:39.2189371Z }, 2025-07-17T08:41:39.2189694Z { 2025-07-17T08:41:39.2190227Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2190922Z "size": 5098803561, 2025-07-17T08:41:39.2191615Z "digest": "sha256:79d9977620eecca9d68daa4ada200d465c7873b097633f8f345ffc781c27f898" 2025-07-17T08:41:39.2192389Z }, 2025-07-17T08:41:39.2192706Z { 2025-07-17T08:41:39.2193244Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2193929Z "size": 193, 2025-07-17T08:41:39.2194606Z "digest": "sha256:9f5a3d4f8993392fe726ea47c2bf39cc61f61256cee93c973fb2cbafc38b7a83" 2025-07-17T08:41:39.2195386Z }, 2025-07-17T08:41:39.2195709Z { 2025-07-17T08:41:39.2196253Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2196945Z "size": 346, 2025-07-17T08:41:39.2197940Z "digest": "sha256:b23e8f7f8b435c3bde38e1b845ee7dc5126fc5d780a924f94f2e509a3c3d1447" 2025-07-17T08:41:39.2198726Z }, 2025-07-17T08:41:39.2199046Z { 2025-07-17T08:41:39.2199667Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2200354Z "size": 88297, 2025-07-17T08:41:39.2201069Z "digest": "sha256:ce8be175f8c65e12653647cf4f8422fb1a80cccd4eccd43709601e635f74a7d5" 2025-07-17T08:41:39.2201857Z }, 2025-07-17T08:41:39.2202180Z { 2025-07-17T08:41:39.2202728Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2203428Z "size": 106, 2025-07-17T08:41:39.2204134Z "digest": "sha256:55fc3af08ef2a75b2a6d92c9471851245d654481b650f63b08c048298af23888" 2025-07-17T08:41:39.2204899Z }, 2025-07-17T08:41:39.2205208Z { 2025-07-17T08:41:39.2205752Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2206448Z "size": 1666, 2025-07-17T08:41:39.2207117Z "digest": "sha256:496c712257d4af1014200db779525259969ca86594f3fc09da71981884f1688d" 2025-07-17T08:41:39.2207874Z }, 2025-07-17T08:41:39.2208198Z { 2025-07-17T08:41:39.2208741Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2209428Z "size": 724, 2025-07-17T08:41:39.2210110Z "digest": "sha256:d81f985aa40d5111b424e8084dfa90790c3078ab270debfa5b46f37c2423c5ce" 2025-07-17T08:41:39.2210884Z }, 2025-07-17T08:41:39.2211197Z { 2025-07-17T08:41:39.2211732Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2212412Z "size": 137, 2025-07-17T08:41:39.2213069Z "digest": "sha256:5277b610601a1d4b1d7671d06690f8c645b403ff8121576e7c9199ecc0aafd81" 2025-07-17T08:41:39.2213828Z }, 2025-07-17T08:41:39.2214153Z { 2025-07-17T08:41:39.2214688Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2215747Z "size": 120, 2025-07-17T08:41:39.2216436Z "digest": "sha256:cdb570d9a441be069ed53d7dd161902340833096b46d848fe9d2c89424de6b66" 2025-07-17T08:41:39.2217212Z }, 2025-07-17T08:41:39.2217537Z { 2025-07-17T08:41:39.2218058Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2218750Z "size": 5345105396, 2025-07-17T08:41:39.2219497Z "digest": "sha256:fe14f59ec5733291d9d3c4a82afdd5f2f387bde7d9f763220a321cc899fee695" 2025-07-17T08:41:39.2220432Z }, 2025-07-17T08:41:39.2220810Z { 2025-07-17T08:41:39.2221408Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2222095Z "size": 175, 2025-07-17T08:41:39.2222757Z "digest": "sha256:219c4a506b1aa90d8462496f8f39585310dac9e294ef62a311f130ec6fef10c8" 2025-07-17T08:41:39.2223523Z }, 2025-07-17T08:41:39.2223854Z { 2025-07-17T08:41:39.2224390Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2225080Z "size": 1897, 2025-07-17T08:41:39.2225829Z "digest": "sha256:0a390bf78ca6647051eae3d59bef1633eef42ce7c2e97b729aa936a4f65cda7b" 2025-07-17T08:41:39.2226632Z }, 2025-07-17T08:41:39.2226961Z { 2025-07-17T08:41:39.2227494Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2228185Z "size": 197488565, 2025-07-17T08:41:39.2228895Z "digest": "sha256:da75f2741b86abc1a64736fba742bed7ada087f6e14c5a74e54183533a05d4ae" 2025-07-17T08:41:39.2229677Z }, 2025-07-17T08:41:39.2230000Z { 2025-07-17T08:41:39.2230530Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2231209Z "size": 302, 2025-07-17T08:41:39.2231856Z "digest": "sha256:648b993dd9d32c5275e385ad268f6cd2de85c0b6538f4806772a8bcb746c6137" 2025-07-17T08:41:39.2232623Z }, 2025-07-17T08:41:39.2232946Z { 2025-07-17T08:41:39.2233482Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2234158Z "size": 32, 2025-07-17T08:41:39.2234850Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-07-17T08:41:39.2235628Z }, 2025-07-17T08:41:39.2236243Z { 2025-07-17T08:41:39.2236796Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2237473Z "size": 108, 2025-07-17T08:41:39.2238152Z "digest": "sha256:32b8cca2c9a35e4af61d2613785bdb322625b0a3db6aaadd82816469f5958c5c" 2025-07-17T08:41:39.2238919Z }, 2025-07-17T08:41:39.2239234Z { 2025-07-17T08:41:39.2239855Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-07-17T08:41:39.2240538Z "size": 54145699, 2025-07-17T08:41:39.2241221Z "digest": "sha256:b8c91749e9f03a297aaa508f21b46757028ef567c6ac02f4acb51a17827d9d69" 2025-07-17T08:41:39.2241988Z } 2025-07-17T08:41:39.2242306Z ] 2025-07-17T08:41:39.2242627Z } 2025-07-17T08:41:39.2242974Z + exit 0 2025-07-17T08:41:39.2293069Z ##[group]Run set -eux 2025-07-17T08:41:39.2293543Z set -eux 2025-07-17T08:41:39.2295070Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin 2025-07-17T08:41:39.2345025Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:39.2345689Z env: 2025-07-17T08:41:39.2346077Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:39.2346784Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:39.2347813Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:39.2348759Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:39.2350412Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:39.2351894Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:39.2352384Z AWS_REGION: us-east-1 2025-07-17T08:41:39.2353460Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:39.2354115Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:39.2363896Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:39.2364358Z ##[endgroup] 2025-07-17T08:41:39.2450832Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-07-17T08:41:39.2452174Z + jq --raw-output .SecretString 2025-07-17T08:41:39.2454488Z + jq -r .docker_hub_readonly_token 2025-07-17T08:41:39.2456391Z + docker login --username pytorchbot --password-stdin 2025-07-17T08:41:39.9298601Z 2025-07-17T08:41:39.9301315Z An error occurred (AccessDeniedException) when calling the GetSecretValue operation: User: arn:aws:sts::308535385114:assumed-role/gha_workflow_s3_and_ecr_read_only/GitHubActions is not authorized to perform: secretsmanager:GetSecretValue on resource: docker_hub_readonly_token because no identity-based policy allows the secretsmanager:GetSecretValue action 2025-07-17T08:41:39.9827766Z Error: Cannot perform an interactive login from a non TTY device 2025-07-17T08:41:39.9863073Z ##[error]Process completed with exit code 1. 2025-07-17T08:41:39.9947130Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-07-17T08:41:39.9947471Z with: 2025-07-17T08:41:39.9947985Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:39.9948616Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:39.9948902Z env: 2025-07-17T08:41:39.9949071Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:39.9949377Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:39.9949825Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:39.9950234Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:39.9950957Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:39.9951608Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:39.9951818Z AWS_REGION: us-east-1 2025-07-17T08:41:39.9952073Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:39.9952341Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:39.9956277Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:39.9956461Z ##[endgroup] 2025-07-17T08:41:39.9968607Z ##[group]Run set -x 2025-07-17T08:41:39.9968827Z set -x 2025-07-17T08:41:39.9969000Z set +e 2025-07-17T08:41:39.9969179Z  2025-07-17T08:41:39.9969380Z login() { 2025-07-17T08:41:39.9969792Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-07-17T08:41:39.9970239Z } 2025-07-17T08:41:39.9970435Z  2025-07-17T08:41:39.9970626Z retry () { 2025-07-17T08:41:39.9970865Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-07-17T08:41:39.9971145Z } 2025-07-17T08:41:39.9971329Z  2025-07-17T08:41:39.9971532Z retry login "${DOCKER_REGISTRY}" 2025-07-17T08:41:39.9971764Z  2025-07-17T08:41:39.9972115Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-07-17T08:41:39.9972585Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-07-17T08:41:39.9972860Z  2025-07-17T08:41:39.9973016Z set -e 2025-07-17T08:41:39.9973265Z # ignore output since only exit code is used for conditional 2025-07-17T08:41:39.9973613Z # only pull docker image if it's not available locally 2025-07-17T08:41:39.9973998Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-07-17T08:41:39.9974355Z  retry docker pull "${DOCKER_IMAGE}" 2025-07-17T08:41:39.9974587Z fi 2025-07-17T08:41:39.9995052Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:39.9995514Z env: 2025-07-17T08:41:39.9995689Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:39.9995996Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:39.9996434Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:39.9996859Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:39.9997575Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:39.9998217Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:39.9998431Z AWS_REGION: us-east-1 2025-07-17T08:41:39.9998675Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:39.9998951Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:40.0002988Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:40.0003671Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:40.0004315Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:40.0004600Z ##[endgroup] 2025-07-17T08:41:40.0046675Z + set +e 2025-07-17T08:41:40.0047315Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:40.0048211Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:40.0051762Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T08:41:40.0052699Z + aws ecr get-login-password --region us-east-1 2025-07-17T08:41:41.4346131Z WARNING! Your password will be stored unencrypted in /home/pytorchci/.docker/config.json. 2025-07-17T08:41:41.4347253Z Configure a credential helper to remove this warning. See 2025-07-17T08:41:41.4348256Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-07-17T08:41:41.4348927Z 2025-07-17T08:41:41.4355234Z Login Succeeded 2025-07-17T08:41:41.4396922Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:41.4398540Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-07-17T08:41:42.9273586Z + IMAGE_SIZE=21749.25635433197 2025-07-17T08:41:42.9274149Z Compressed size of image in MB: 21749.25635433197 2025-07-17T08:41:42.9274769Z + echo 'Compressed size of image in MB: 21749.25635433197' 2025-07-17T08:41:42.9275281Z + set -e 2025-07-17T08:41:42.9276394Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:41:42.9521679Z Prepare all required actions 2025-07-17T08:41:42.9584052Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-07-17T08:41:42.9584528Z with: 2025-07-17T08:41:42.9585168Z github-token: *** 2025-07-17T08:41:42.9585500Z env: 2025-07-17T08:41:42.9585810Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:42.9586398Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:42.9587233Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:42.9587997Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:42.9589343Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:42.9590558Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:42.9590998Z AWS_REGION: us-east-1 2025-07-17T08:41:42.9591592Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:42.9592244Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:42.9602386Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:42.9602833Z ##[endgroup] 2025-07-17T08:41:42.9628499Z ##[group]Run set -eux 2025-07-17T08:41:42.9628977Z set -eux 2025-07-17T08:41:42.9629743Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-07-17T08:41:42.9679067Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:41:42.9679853Z env: 2025-07-17T08:41:42.9680256Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:42.9681115Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:42.9682370Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:42.9683516Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:42.9685509Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:42.9687039Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:42.9687346Z AWS_REGION: us-east-1 2025-07-17T08:41:42.9687652Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:42.9688044Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:42.9693290Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:42.9693651Z GITHUB_TOKEN: *** 2025-07-17T08:41:42.9693875Z ##[endgroup] 2025-07-17T08:41:42.9752936Z + python3 .github/scripts/get_workflow_job_id.py 16337959866 pytorch-rocm-hw-43 2025-07-17T08:41:43.3803684Z Setting output job-id=46159470975 2025-07-17T08:41:43.3804633Z Setting output job-name=rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:41:43.4018889Z Prepare all required actions 2025-07-17T08:41:43.4019308Z Getting action download info 2025-07-17T08:41:43.5849607Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-07-17T08:41:44.2172520Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-07-17T08:41:44.8022107Z ##[group]Run ./.github/actions/download-build-artifacts 2025-07-17T08:41:44.8022676Z with: 2025-07-17T08:41:44.8023062Z name: linux-jammy-rocm-py3.10 2025-07-17T08:41:44.8023568Z s3-bucket: gha-artifacts 2025-07-17T08:41:44.8023966Z env: 2025-07-17T08:41:44.8024309Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:44.8024895Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:44.8025675Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:44.8026456Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:44.8027723Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:44.8028863Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:44.8029235Z AWS_REGION: us-east-1 2025-07-17T08:41:44.8029705Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:44.8030198Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:44.8038334Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:44.8038682Z ##[endgroup] 2025-07-17T08:41:44.8082293Z ##[group]Run seemethere/download-artifact-s3@v4 2025-07-17T08:41:44.8082732Z with: 2025-07-17T08:41:44.8083048Z name: linux-jammy-rocm-py3.10 2025-07-17T08:41:44.8083433Z s3-bucket: gha-artifacts 2025-07-17T08:41:44.8083779Z region: us-east-1 2025-07-17T08:41:44.8084072Z env: 2025-07-17T08:41:44.8084344Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:41:44.8084884Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:41:44.8085654Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:41:44.8086360Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:41:44.8087597Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:41:44.8088722Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:41:44.8089089Z AWS_REGION: us-east-1 2025-07-17T08:41:44.8089842Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:41:44.8090330Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:41:44.8098771Z AWS_SESSION_TOKEN: *** 2025-07-17T08:41:44.8099111Z ##[endgroup] 2025-07-17T08:41:45.2193309Z (node:327377) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-07-17T08:41:45.2194216Z 2025-07-17T08:41:45.2194610Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-07-17T08:41:45.2195564Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-07-17T08:41:45.2196522Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-07-17T08:41:45.4988706Z Found 1 objects with prefix pytorch/pytorch/16337959866/linux-jammy-rocm-py3.10/ 2025-07-17T08:41:45.4989977Z Starting download (1/1): /home/pytorchci/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-07-17T08:43:17.8075341Z Finished download (1/1): /home/pytorchci/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-07-17T08:43:17.8093956Z Artifact download has finished successfully 2025-07-17T08:43:17.8695093Z ##[group]Run unzip -o artifacts.zip 2025-07-17T08:43:17.8695700Z unzip -o artifacts.zip 2025-07-17T08:43:17.8745243Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:17.8745907Z env: 2025-07-17T08:43:17.8746294Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:17.8747530Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:17.8748622Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:17.8749640Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:17.8751315Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:17.8752832Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:17.8753328Z AWS_REGION: us-east-1 2025-07-17T08:43:17.8753947Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:17.8754617Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:17.8764460Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:17.8764925Z ##[endgroup] 2025-07-17T08:43:17.8863572Z Archive: artifacts.zip 2025-07-17T08:43:17.8864114Z creating: dist/ 2025-07-17T08:43:23.0607961Z inflating: dist/torch-2.9.0a0+gita38f433-cp310-cp310-linux_x86_64.whl 2025-07-17T08:43:23.0733970Z inflating: dist/.ninja_log 2025-07-17T08:43:23.0739253Z creating: build/custom_test_artifacts/ 2025-07-17T08:43:23.0740143Z creating: build/custom_test_artifacts/custom-op-build/ 2025-07-17T08:43:23.0741017Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-07-17T08:43:23.0742033Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-07-17T08:43:23.0743421Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-07-17T08:43:23.0744760Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/ 2025-07-17T08:43:23.0746145Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-07-17T08:43:23.0747543Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-07-17T08:43:23.0748869Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-07-17T08:43:23.0749683Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-07-17T08:43:23.0750371Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-07-17T08:43:23.0751010Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-07-17T08:43:23.0751627Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-07-17T08:43:23.0752220Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-07-17T08:43:23.0754005Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-07-17T08:43:23.0754737Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-07-17T08:43:23.0755403Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-07-17T08:43:23.0756135Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-07-17T08:43:23.0756899Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-07-17T08:43:23.0757553Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-07-17T08:43:23.0758083Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-07-17T08:43:23.0758649Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-07-17T08:43:23.0759248Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-07-17T08:43:23.0760187Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-07-17T08:43:23.0760918Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-07-17T08:43:23.0761853Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-07-17T08:43:23.0762517Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-07-17T08:43:23.0763187Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-07-17T08:43:23.0763858Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-07-17T08:43:23.0764527Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-07-17T08:43:23.0765198Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-07-17T08:43:23.0765865Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-07-17T08:43:23.0777558Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-07-17T08:43:23.0953380Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-07-17T08:43:23.0954710Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-07-17T08:43:23.0956065Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-07-17T08:43:23.0957583Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-07-17T08:43:23.0959037Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-07-17T08:43:23.0960623Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-07-17T08:43:23.0962067Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-07-17T08:43:23.0963643Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-07-17T08:43:23.0965294Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-07-17T08:43:23.0966958Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-07-17T08:43:23.0968543Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-07-17T08:43:23.0977231Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-07-17T08:43:23.1052686Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-07-17T08:43:23.1055164Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-07-17T08:43:23.1056821Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-07-17T08:43:23.1058265Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-07-17T08:43:23.1058998Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-07-17T08:43:23.1059584Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-07-17T08:43:23.1060177Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/InstallScripts.json 2025-07-17T08:43:23.1060773Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-07-17T08:43:23.1061344Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-07-17T08:43:23.1061873Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-07-17T08:43:23.1062358Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-07-17T08:43:23.1062855Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-07-17T08:43:23.1209735Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-07-17T08:43:23.1261448Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-07-17T08:43:23.1262531Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-07-17T08:43:23.1263374Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-07-17T08:43:23.1264375Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-07-17T08:43:23.1265745Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-07-17T08:43:23.1266866Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/ 2025-07-17T08:43:23.1267960Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-07-17T08:43:23.1269133Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-07-17T08:43:23.1270280Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-07-17T08:43:23.1271612Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-07-17T08:43:23.1273055Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-07-17T08:43:23.1274532Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-07-17T08:43:23.1275962Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-07-17T08:43:23.1277319Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-07-17T08:43:23.1278785Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-07-17T08:43:23.1279646Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-07-17T08:43:23.1280310Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-07-17T08:43:23.1281020Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-07-17T08:43:23.1309108Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-07-17T08:43:23.1310442Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-07-17T08:43:23.1311453Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-07-17T08:43:23.1312582Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-07-17T08:43:23.1313862Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-07-17T08:43:23.1315981Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-07-17T08:43:23.1317657Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-07-17T08:43:23.1318833Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-07-17T08:43:23.1319656Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-07-17T08:43:23.1320369Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-07-17T08:43:23.1321080Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-07-17T08:43:23.1321778Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-07-17T08:43:23.1322480Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-07-17T08:43:23.1323191Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-07-17T08:43:23.1323954Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-07-17T08:43:23.1362933Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-07-17T08:43:23.1363756Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-07-17T08:43:23.1364468Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-07-17T08:43:23.1365091Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-07-17T08:43:23.1365659Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-07-17T08:43:23.1366209Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-07-17T08:43:23.1366795Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/InstallScripts.json 2025-07-17T08:43:23.1367377Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-07-17T08:43:23.1367935Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-07-17T08:43:23.1369283Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-07-17T08:43:23.1370383Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-07-17T08:43:23.1371589Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-07-17T08:43:23.1405796Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-07-17T08:43:23.1406762Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-07-17T08:43:23.1407719Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-07-17T08:43:23.1408848Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-07-17T08:43:23.1410599Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-07-17T08:43:23.1412134Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/ 2025-07-17T08:43:23.1413528Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-07-17T08:43:23.1414861Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-07-17T08:43:23.1416149Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-07-17T08:43:23.1417621Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-07-17T08:43:23.1419056Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-07-17T08:43:23.1420406Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-07-17T08:43:23.1422602Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-07-17T08:43:23.1423857Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-07-17T08:43:23.1425366Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-07-17T08:43:23.1426856Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-07-17T08:43:23.1428224Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-07-17T08:43:23.1429712Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-07-17T08:43:23.1431293Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-07-17T08:43:23.1432656Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-07-17T08:43:23.1433772Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-07-17T08:43:23.1434913Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-07-17T08:43:23.1436164Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-07-17T08:43:23.1437902Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-07-17T08:43:23.1439645Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-07-17T08:43:23.1441149Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-07-17T08:43:23.1442565Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-07-17T08:43:23.1444036Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-07-17T08:43:23.1445516Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-07-17T08:43:23.1446976Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-07-17T08:43:23.1448447Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-07-17T08:43:23.1449877Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-07-17T08:43:23.1451618Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-07-17T08:43:23.1547434Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-07-17T08:43:23.1549070Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-07-17T08:43:23.1550606Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-07-17T08:43:23.1552192Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-07-17T08:43:23.1553721Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-07-17T08:43:23.1555171Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-07-17T08:43:23.1556607Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-07-17T08:43:23.1558061Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-07-17T08:43:23.1559641Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-07-17T08:43:23.1561540Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-07-17T08:43:23.1562980Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-07-17T08:43:23.1572271Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-07-17T08:43:23.1622905Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-07-17T08:43:23.1624629Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-07-17T08:43:23.1626093Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-07-17T08:43:23.1627432Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-07-17T08:43:23.1628695Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-07-17T08:43:23.1629898Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-07-17T08:43:23.1631158Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/InstallScripts.json 2025-07-17T08:43:23.1632784Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-07-17T08:43:23.1633947Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-07-17T08:43:23.1635014Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-07-17T08:43:23.1635973Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-07-17T08:43:23.1636949Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-07-17T08:43:23.1721448Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-07-17T08:43:23.1756585Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-07-17T08:43:23.1757424Z creating: build/lib/ 2025-07-17T08:43:23.1831126Z inflating: build/lib/libprotobuf-lite.a 2025-07-17T08:43:23.2230560Z inflating: build/lib/libprotobuf.a 2025-07-17T08:43:23.2676077Z inflating: build/lib/libprotoc.a 2025-07-17T08:43:23.2684744Z inflating: build/lib/libpthreadpool.a 2025-07-17T08:43:23.2692122Z inflating: build/lib/libcpuinfo.a 2025-07-17T08:43:23.2699064Z inflating: build/lib/libcpuinfo_internals.a 2025-07-17T08:43:23.2699811Z inflating: build/lib/libclog.a 2025-07-17T08:43:23.2716617Z inflating: build/lib/libpytorch_qnnpack.a 2025-07-17T08:43:23.2718006Z inflating: build/lib/libnnpack_reference_layers.a 2025-07-17T08:43:23.2734409Z inflating: build/lib/libnnpack.a 2025-07-17T08:43:23.2900923Z inflating: build/lib/libmicrokernels-prod.a 2025-07-17T08:43:23.3682282Z inflating: build/lib/libmicrokernels-all.a 2025-07-17T08:43:23.3745554Z inflating: build/lib/libgtest.a 2025-07-17T08:43:23.3760915Z inflating: build/lib/libgmock.a 2025-07-17T08:43:23.3761607Z inflating: build/lib/libgtest_main.a 2025-07-17T08:43:23.3762199Z inflating: build/lib/libgmock_main.a 2025-07-17T08:43:23.3842918Z inflating: build/lib/libXNNPACK.a 2025-07-17T08:43:23.3910308Z inflating: build/lib/libbenchmark.a 2025-07-17T08:43:23.3911042Z inflating: build/lib/libbenchmark_main.a 2025-07-17T08:43:23.3911664Z inflating: build/lib/libjitprofiling.a 2025-07-17T08:43:23.3918743Z inflating: build/lib/libittnotify.a 2025-07-17T08:43:23.3976261Z inflating: build/lib/libasmjit.a 2025-07-17T08:43:23.5062111Z inflating: build/lib/libfbgemm.a 2025-07-17T08:43:23.5086440Z inflating: build/lib/libtensorpipe_uv.a 2025-07-17T08:43:23.5589008Z inflating: build/lib/libtensorpipe.a 2025-07-17T08:43:23.5697368Z inflating: build/lib/libgloo.a 2025-07-17T08:43:23.5740310Z inflating: build/lib/libonnx_proto.a 2025-07-17T08:43:23.6135188Z inflating: build/lib/libgloo_hip.a 2025-07-17T08:43:23.6785234Z inflating: build/lib/libonnx.a 2025-07-17T08:43:24.6118916Z inflating: build/lib/libdnnl.a 2025-07-17T08:43:24.6135471Z inflating: build/lib/libfmt.a 2025-07-17T08:43:24.6394426Z inflating: build/lib/libkineto.a 2025-07-17T08:43:24.6489671Z inflating: build/lib/libc10.so 2025-07-17T08:43:24.6490441Z inflating: build/lib/libtorch_global_deps.so 2025-07-17T08:43:24.6536527Z inflating: build/lib/libc10_hip.so 2025-07-17T08:43:24.6537226Z inflating: build/lib/libcaffe2_nvrtc.so 2025-07-17T08:43:27.6533110Z inflating: build/lib/libtorch_cpu.so 2025-07-17T08:43:27.6535830Z inflating: build/lib/libshm.so 2025-07-17T08:43:28.4146639Z inflating: build/lib/libtorch_hip.so 2025-07-17T08:43:28.4147355Z inflating: build/lib/libtorch.so 2025-07-17T08:43:28.4164176Z inflating: build/lib/libjitbackend_test.so 2025-07-17T08:43:28.4185498Z inflating: build/lib/libbackend_with_compiler.so 2025-07-17T08:43:28.4247960Z inflating: build/lib/libtorchbind_test.so 2025-07-17T08:43:28.4271503Z inflating: build/lib/libaoti_custom_ops.so 2025-07-17T08:43:28.6094693Z inflating: build/lib/libtorch_python.so 2025-07-17T08:43:28.6124464Z inflating: build/lib/libnnapi_backend.so 2025-07-17T08:43:28.6125143Z creating: build/bin/ 2025-07-17T08:43:28.6126332Z creating: build/bin/CMakeFiles/ 2025-07-17T08:43:28.6126959Z inflating: build/bin/cmake_install.cmake 2025-07-17T08:43:28.6127581Z inflating: build/bin/CTestTestfile.cmake 2025-07-17T08:43:28.6529660Z inflating: build/bin/protoc-3.13.0.0 2025-07-17T08:43:28.6933056Z inflating: build/bin/protoc 2025-07-17T08:43:28.6982002Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-07-17T08:43:28.7032307Z inflating: build/bin/c10_DeviceGuard_test 2025-07-17T08:43:28.7082987Z inflating: build/bin/c10_Device_test 2025-07-17T08:43:28.7140580Z inflating: build/bin/c10_DispatchKeySet_test 2025-07-17T08:43:28.7193306Z inflating: build/bin/c10_Scalar_test 2025-07-17T08:43:28.7241466Z inflating: build/bin/c10_StreamGuard_test 2025-07-17T08:43:28.7291369Z inflating: build/bin/c10_SymInt_test 2025-07-17T08:43:28.7344541Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-07-17T08:43:28.7392496Z inflating: build/bin/c10_ConstexprCrc_test 2025-07-17T08:43:28.7447290Z inflating: build/bin/c10_InlineStreamGuard_test 2025-07-17T08:43:28.7495847Z inflating: build/bin/c10_ArrayRef_test 2025-07-17T08:43:28.7550486Z inflating: build/bin/c10_SizesAndStrides_test 2025-07-17T08:43:28.7602611Z inflating: build/bin/c10_Bitset_test 2025-07-17T08:43:28.7670163Z inflating: build/bin/c10_cow_test 2025-07-17T08:43:28.7725855Z inflating: build/bin/c10_Enumerate_test 2025-07-17T08:43:28.7774629Z inflating: build/bin/c10_DeadlockDetection_test 2025-07-17T08:43:28.7824309Z inflating: build/bin/c10_Half_test 2025-07-17T08:43:28.7875950Z inflating: build/bin/c10_IntrusiveList_test 2025-07-17T08:43:28.7930417Z inflating: build/bin/c10_LeftRight_test 2025-07-17T08:43:28.7984721Z inflating: build/bin/c10_Metaprogramming_test 2025-07-17T08:43:28.8036729Z inflating: build/bin/c10_NetworkFlow_test 2025-07-17T08:43:28.8086165Z inflating: build/bin/c10_Semaphore_test 2025-07-17T08:43:28.8140385Z inflating: build/bin/c10_ThreadLocal_test 2025-07-17T08:43:28.8191139Z inflating: build/bin/c10_TypeIndex_test 2025-07-17T08:43:28.8240297Z inflating: build/bin/c10_Synchronized_test 2025-07-17T08:43:28.8290044Z inflating: build/bin/c10_TypeList_test 2025-07-17T08:43:28.8338174Z inflating: build/bin/c10_TypeTraits_test 2025-07-17T08:43:28.8388708Z inflating: build/bin/c10_accumulate_test 2025-07-17T08:43:28.8444063Z inflating: build/bin/c10_complex_math_test 2025-07-17T08:43:28.8493908Z inflating: build/bin/c10_bit_cast_test 2025-07-17T08:43:28.8548615Z inflating: build/bin/c10_bfloat16_test 2025-07-17T08:43:28.8597071Z inflating: build/bin/c10_error_test 2025-07-17T08:43:28.8650977Z inflating: build/bin/c10_complex_test 2025-07-17T08:43:28.8702839Z inflating: build/bin/c10_exception_test 2025-07-17T08:43:28.8752072Z inflating: build/bin/c10_flags_test 2025-07-17T08:43:28.8801439Z inflating: build/bin/c10_generic_math_test 2025-07-17T08:43:28.8851254Z inflating: build/bin/c10_irange_test 2025-07-17T08:43:28.9007354Z inflating: build/bin/c10_intrusive_ptr_test 2025-07-17T08:43:28.9059515Z inflating: build/bin/c10_lazy_test 2025-07-17T08:43:28.9115115Z inflating: build/bin/c10_logging_test 2025-07-17T08:43:28.9187460Z inflating: build/bin/c10_optional_test 2025-07-17T08:43:28.9240288Z inflating: build/bin/c10_registry_test 2025-07-17T08:43:28.9300219Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-07-17T08:43:28.9447430Z inflating: build/bin/c10_small_vector_test 2025-07-17T08:43:28.9497854Z inflating: build/bin/c10_ssize_test 2025-07-17T08:43:28.9552895Z inflating: build/bin/c10_string_util_test 2025-07-17T08:43:28.9601036Z inflating: build/bin/c10_string_view_test 2025-07-17T08:43:28.9650138Z inflating: build/bin/c10_tempfile_test 2025-07-17T08:43:28.9692848Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-07-17T08:43:28.9747659Z inflating: build/bin/c10_typeid_test 2025-07-17T08:43:28.9796319Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-07-17T08:43:28.9844994Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-07-17T08:43:28.9892692Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-07-17T08:43:28.9940723Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-07-17T08:43:28.9988954Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-07-17T08:43:29.0036902Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-07-17T08:43:29.0085071Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-07-17T08:43:29.0133116Z inflating: build/bin/c10_hip_HIPTest 2025-07-17T08:43:29.1152616Z inflating: build/bin/vec_test_all_types_AVX512 2025-07-17T08:43:29.2158496Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-07-17T08:43:29.3182891Z inflating: build/bin/vec_test_all_types_AVX2 2025-07-17T08:43:29.3234265Z inflating: build/bin/BackoffTest 2025-07-17T08:43:29.3286227Z inflating: build/bin/FileStoreTest 2025-07-17T08:43:29.3341927Z inflating: build/bin/TCPStoreTest 2025-07-17T08:43:29.3394481Z inflating: build/bin/HashStoreTest 2025-07-17T08:43:29.3458341Z inflating: build/bin/ProcessGroupGlooTest 2025-07-17T08:43:29.3460034Z inflating: build/bin/example_allreduce 2025-07-17T08:43:29.3464330Z inflating: build/bin/torch_shm_manager 2025-07-17T08:43:29.3516359Z inflating: build/bin/static_runtime_bench 2025-07-17T08:43:29.3587757Z inflating: build/bin/Dict_test 2025-07-17T08:43:29.3639018Z inflating: build/bin/Dimname_test 2025-07-17T08:43:29.3702324Z inflating: build/bin/MaybeOwned_test 2025-07-17T08:43:29.3757753Z inflating: build/bin/NamedTensor_test 2025-07-17T08:43:29.3814806Z inflating: build/bin/apply_utils_test 2025-07-17T08:43:29.3871994Z inflating: build/bin/atest 2025-07-17T08:43:29.3926305Z inflating: build/bin/broadcast_test 2025-07-17T08:43:29.3976093Z inflating: build/bin/cpu_allocator_test 2025-07-17T08:43:29.4037177Z inflating: build/bin/basic 2025-07-17T08:43:29.4093684Z inflating: build/bin/cpu_generator_test 2025-07-17T08:43:29.4145371Z inflating: build/bin/cpu_profiling_allocator_test 2025-07-17T08:43:29.4380027Z inflating: build/bin/static_runtime_test 2025-07-17T08:43:29.4430026Z inflating: build/bin/dlconvertor_test 2025-07-17T08:43:29.4485736Z inflating: build/bin/extension_backend_test 2025-07-17T08:43:29.4540200Z inflating: build/bin/half_test 2025-07-17T08:43:29.4588940Z inflating: build/bin/lazy_tensor_test 2025-07-17T08:43:29.4641280Z inflating: build/bin/memory_format_test 2025-07-17T08:43:29.4693517Z inflating: build/bin/math_kernel_test 2025-07-17T08:43:29.4745901Z inflating: build/bin/memory_overlapping_test 2025-07-17T08:43:29.4795707Z inflating: build/bin/operator_name_test 2025-07-17T08:43:29.4845469Z inflating: build/bin/operators_test 2025-07-17T08:43:29.4936658Z inflating: build/bin/ivalue_test 2025-07-17T08:43:29.4987512Z inflating: build/bin/packedtensoraccessor_test 2025-07-17T08:43:29.5042062Z inflating: build/bin/native_test 2025-07-17T08:43:29.5130467Z inflating: build/bin/cpu_rng_test 2025-07-17T08:43:29.5182498Z inflating: build/bin/mobile_memory_cleanup 2025-07-17T08:43:29.5238224Z inflating: build/bin/quantized_test 2025-07-17T08:43:29.5288257Z inflating: build/bin/reportMemoryUsage_test 2025-07-17T08:43:29.5353015Z inflating: build/bin/pow_test 2025-07-17T08:43:29.5402205Z inflating: build/bin/reduce_ops_test 2025-07-17T08:43:29.5457123Z inflating: build/bin/scalar_tensor_test 2025-07-17T08:43:29.5507704Z inflating: build/bin/StorageUtils_test 2025-07-17T08:43:29.5565077Z inflating: build/bin/scalar_test 2025-07-17T08:43:29.5618845Z inflating: build/bin/type_ptr_test 2025-07-17T08:43:29.5669734Z inflating: build/bin/stride_properties_test 2025-07-17T08:43:29.5671276Z inflating: build/bin/thread_init_test 2025-07-17T08:43:29.5724625Z inflating: build/bin/undefined_tensor_test 2025-07-17T08:43:29.5777129Z inflating: build/bin/test_parallel 2025-07-17T08:43:29.5777845Z inflating: build/bin/verify_api_visibility 2025-07-17T08:43:29.5854596Z inflating: build/bin/tensor_iterator_test 2025-07-17T08:43:29.5912696Z inflating: build/bin/IListRef_test 2025-07-17T08:43:29.6014338Z inflating: build/bin/List_test 2025-07-17T08:43:29.6064643Z inflating: build/bin/weakref_test 2025-07-17T08:43:29.6115033Z inflating: build/bin/xla_tensor_test 2025-07-17T08:43:29.6165366Z inflating: build/bin/wrapdim_test 2025-07-17T08:43:29.6256065Z inflating: build/bin/kernel_function_test 2025-07-17T08:43:29.6370277Z inflating: build/bin/kernel_function_legacy_test 2025-07-17T08:43:29.6489352Z inflating: build/bin/kernel_lambda_legacy_test 2025-07-17T08:43:29.6547826Z inflating: build/bin/kernel_stackbased_test 2025-07-17T08:43:29.6644624Z inflating: build/bin/kernel_lambda_test 2025-07-17T08:43:29.6735011Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-07-17T08:43:29.6785080Z inflating: build/bin/CppSignature_test 2025-07-17T08:43:29.6833402Z inflating: build/bin/op_allowlist_test 2025-07-17T08:43:29.6891752Z inflating: build/bin/type_test 2025-07-17T08:43:29.7182580Z inflating: build/bin/op_registration_test 2025-07-17T08:43:29.7249950Z inflating: build/bin/legacy_vmap_test 2025-07-17T08:43:29.7314429Z inflating: build/bin/inline_container_test 2025-07-17T08:43:29.7363023Z inflating: build/bin/hip_complex_math_test 2025-07-17T08:43:29.7414648Z inflating: build/bin/hip_apply_test 2025-07-17T08:43:29.7462693Z inflating: build/bin/hip_complex_test 2025-07-17T08:43:29.7510879Z inflating: build/bin/hip_distributions_test 2025-07-17T08:43:29.7558774Z inflating: build/bin/hip_generator_test 2025-07-17T08:43:29.7606863Z inflating: build/bin/hip_half_test 2025-07-17T08:43:29.7655164Z inflating: build/bin/hip_integer_divider_test 2025-07-17T08:43:29.7703160Z inflating: build/bin/hip_optional_test 2025-07-17T08:43:29.7751889Z inflating: build/bin/hip_packedtensoraccessor_test 2025-07-17T08:43:29.7800131Z inflating: build/bin/hip_vectorized_test 2025-07-17T08:43:29.7863903Z inflating: build/bin/KernelFunction_test 2025-07-17T08:43:29.7917490Z inflating: build/bin/backend_fallback_test 2025-07-17T08:43:29.7968938Z inflating: build/bin/hip_dlconvertor_test 2025-07-17T08:43:29.8969625Z inflating: build/bin/test_jit 2025-07-17T08:43:29.9227555Z inflating: build/bin/test_nativert 2025-07-17T08:43:29.9239057Z inflating: build/bin/tutorial_tensorexpr 2025-07-17T08:43:29.9292517Z inflating: build/bin/test_dist_autograd 2025-07-17T08:43:29.9357466Z inflating: build/bin/test_cpp_rpc 2025-07-17T08:43:30.0076478Z inflating: build/bin/test_tensorexpr 2025-07-17T08:43:30.0077884Z inflating: build/bin/parallel_benchmark 2025-07-17T08:43:30.0143437Z inflating: build/bin/test_mobile_nnc 2025-07-17T08:43:30.1210786Z inflating: build/bin/test_api 2025-07-17T08:43:30.1218621Z inflating: build/bin/aot_model_compiler_test 2025-07-17T08:43:30.1541106Z inflating: build/bin/test_lazy 2025-07-17T08:43:30.1541777Z creating: .additional_ci_files/ 2025-07-17T08:43:30.1685690Z inflating: .additional_ci_files/test-times.json 2025-07-17T08:43:30.2237020Z inflating: .additional_ci_files/test-class-times.json 2025-07-17T08:43:30.2285619Z ##[group]Run rm artifacts.zip 2025-07-17T08:43:30.2285918Z rm artifacts.zip 2025-07-17T08:43:30.2330818Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:30.2331216Z env: 2025-07-17T08:43:30.2331423Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:30.2331804Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:30.2332349Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:30.2333229Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:30.2334112Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:30.2334912Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:30.2335178Z AWS_REGION: us-east-1 2025-07-17T08:43:30.2335532Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:30.2335967Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:30.2340920Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:30.2341169Z ##[endgroup] 2025-07-17T08:43:30.5472818Z ##[group]Run df -H 2025-07-17T08:43:30.5473283Z df -H 2025-07-17T08:43:30.5526346Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:30.5527149Z env: 2025-07-17T08:43:30.5527625Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:30.5528477Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:30.5529760Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:30.5530945Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:30.5532084Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:30.5532862Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:30.5533121Z AWS_REGION: us-east-1 2025-07-17T08:43:30.5533426Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:30.5533783Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:30.5540424Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:30.5540859Z ##[endgroup] 2025-07-17T08:43:30.5629191Z Filesystem Size Used Avail Use% Mounted on 2025-07-17T08:43:30.5629969Z tmpfs 14G 18M 14G 1% /run 2025-07-17T08:43:30.5630725Z /dev/mapper/ubuntu--vg-ubuntu--lv 1.9T 718G 1.1T 41% / 2025-07-17T08:43:30.5631495Z tmpfs 68G 8.2k 68G 1% /dev/shm 2025-07-17T08:43:30.5632171Z tmpfs 5.3M 0 5.3M 0% /run/lock 2025-07-17T08:43:30.5632863Z /dev/sda2 2.1G 335M 1.6G 18% /boot 2025-07-17T08:43:30.5633983Z /dev/sda1 1.2G 6.4M 1.2G 1% /boot/efi 2025-07-17T08:43:30.5634829Z tmpfs 14G 17k 14G 1% /run/user/1001 2025-07-17T08:43:30.5677446Z Prepare all required actions 2025-07-17T08:43:30.5678135Z Getting action download info 2025-07-17T08:43:30.9455522Z ##[group]Run ./.github/actions/download-td-artifacts 2025-07-17T08:43:30.9456167Z with: 2025-07-17T08:43:30.9456543Z env: 2025-07-17T08:43:30.9456919Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:30.9457670Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:30.9458711Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:30.9459677Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:30.9461345Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:30.9462858Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:30.9463364Z AWS_REGION: us-east-1 2025-07-17T08:43:30.9464021Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:30.9464681Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:30.9474304Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:30.9474795Z ##[endgroup] 2025-07-17T08:43:30.9527911Z ##[group]Run seemethere/download-artifact-s3@v4 2025-07-17T08:43:30.9528507Z with: 2025-07-17T08:43:30.9528870Z name: td_results 2025-07-17T08:43:30.9529296Z s3-bucket: gha-artifacts 2025-07-17T08:43:30.9529764Z region: us-east-1 2025-07-17T08:43:30.9530162Z env: 2025-07-17T08:43:30.9530544Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:30.9531271Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:30.9532341Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:30.9533347Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:30.9535060Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:30.9536592Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:30.9537090Z AWS_REGION: us-east-1 2025-07-17T08:43:30.9537643Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:30.9538291Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:30.9548096Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:30.9548557Z ##[endgroup] 2025-07-17T08:43:31.3576460Z (node:327475) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-07-17T08:43:31.3577380Z 2025-07-17T08:43:31.3577750Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-07-17T08:43:31.3578686Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-07-17T08:43:31.3579625Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-07-17T08:43:31.6286510Z Found 0 objects with prefix pytorch/pytorch/16337959866/td_results/ 2025-07-17T08:43:31.6293705Z Artifact download has finished successfully 2025-07-17T08:43:31.6581285Z ##[group]Run mkdir -p .additional_ci_files 2025-07-17T08:43:31.6581958Z mkdir -p .additional_ci_files 2025-07-17T08:43:31.6582745Z mv td_results.json .additional_ci_files/td_results.json || true 2025-07-17T08:43:31.6631685Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:31.6632376Z env: 2025-07-17T08:43:31.6632767Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:31.6633520Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:31.6634600Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:31.6635622Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:31.6637338Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:31.6638863Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:31.6639946Z AWS_REGION: us-east-1 2025-07-17T08:43:31.6640585Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:31.6641253Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:31.6652296Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:31.6652845Z ##[endgroup] 2025-07-17T08:43:31.6768328Z mv: cannot stat 'td_results.json': No such file or directory 2025-07-17T08:43:31.6803914Z ##[group]Run .github/scripts/parse_ref.py 2025-07-17T08:43:31.6804304Z .github/scripts/parse_ref.py 2025-07-17T08:43:31.6836297Z shell: /usr/bin/bash -e {0} 2025-07-17T08:43:31.6836833Z env: 2025-07-17T08:43:31.6837252Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:31.6838011Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:31.6839111Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:31.6840285Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:31.6842018Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:31.6843602Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:31.6844110Z AWS_REGION: us-east-1 2025-07-17T08:43:31.6844692Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:31.6845418Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:31.6854747Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:31.6854996Z ##[endgroup] 2025-07-17T08:43:31.7078128Z Setting output branch=main 2025-07-17T08:43:31.7260327Z Prepare all required actions 2025-07-17T08:43:31.7261059Z Getting action download info 2025-07-17T08:43:31.9306401Z ##[group]Run ./.github/actions/filter-test-configs 2025-07-17T08:43:31.9307046Z with: 2025-07-17T08:43:31.9307761Z github-token: *** 2025-07-17T08:43:31.9308988Z test-matrix: {"include": [{"config": "inductor", "shard": 1, "num_shards": 2, "runner": "linux.rocm.gpu.2"}, {"config": "inductor", "shard": 2, "num_shards": 2, "runner": "linux.rocm.gpu.2"}]} 2025-07-17T08:43:31.9310489Z job-name: rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:43:31.9311254Z env: 2025-07-17T08:43:31.9311639Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:31.9312361Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:31.9313398Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:31.9314346Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:31.9316064Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:31.9317557Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:31.9318046Z AWS_REGION: us-east-1 2025-07-17T08:43:31.9318603Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:31.9319292Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:31.9328284Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:31.9328517Z ##[endgroup] 2025-07-17T08:43:31.9383193Z ##[group]Run nick-fields/retry@v3.0.0 2025-07-17T08:43:31.9383740Z with: 2025-07-17T08:43:31.9384099Z shell: bash 2025-07-17T08:43:31.9384494Z timeout_minutes: 10 2025-07-17T08:43:31.9384916Z max_attempts: 5 2025-07-17T08:43:31.9385322Z retry_wait_seconds: 30 2025-07-17T08:43:31.9386683Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2025-07-17T08:43:31.9388089Z polling_interval_seconds: 1 2025-07-17T08:43:31.9388589Z warning_on_retry: true 2025-07-17T08:43:31.9389030Z continue_on_error: false 2025-07-17T08:43:31.9389476Z env: 2025-07-17T08:43:31.9389850Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:31.9390565Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:31.9391635Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:31.9392608Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:31.9394699Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:31.9396215Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:31.9396711Z AWS_REGION: us-east-1 2025-07-17T08:43:31.9397248Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:31.9397924Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:31.9407600Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:31.9407949Z GITHUB_TOKEN: *** 2025-07-17T08:43:31.9408179Z ##[endgroup] 2025-07-17T08:43:32.0074341Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2025-07-17T08:43:32.2620661Z Defaulting to user installation because normal site-packages is not writeable 2025-07-17T08:43:32.3411634Z Requirement already satisfied: requests==2.27.1 in /home/pytorchci/.local/lib/python3.10/site-packages (2.27.1) 2025-07-17T08:43:32.3416313Z Requirement already satisfied: pyyaml==6.0.1 in /home/pytorchci/.local/lib/python3.10/site-packages (6.0.1) 2025-07-17T08:43:32.3510817Z Requirement already satisfied: charset-normalizer~=2.0.0 in /home/pytorchci/.local/lib/python3.10/site-packages (from requests==2.27.1) (2.0.12) 2025-07-17T08:43:32.3520489Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests==2.27.1) (3.3) 2025-07-17T08:43:32.3525926Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests==2.27.1) (1.26.5) 2025-07-17T08:43:32.3530116Z Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests==2.27.1) (2020.6.20) 2025-07-17T08:43:33.0077166Z Command completed after 1 attempt(s). 2025-07-17T08:43:33.0206690Z ##[group]Run set -x 2025-07-17T08:43:33.0207341Z set -x 2025-07-17T08:43:33.0207740Z  2025-07-17T08:43:33.0208416Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-07-17T08:43:33.0209259Z # in runner workspace 2025-07-17T08:43:33.0209972Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-07-17T08:43:33.0260810Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:33.0261492Z env: 2025-07-17T08:43:33.0261883Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:33.0262617Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:33.0263693Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:33.0264668Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:33.0266357Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:33.0267892Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:33.0268383Z AWS_REGION: us-east-1 2025-07-17T08:43:33.0268944Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:33.0269582Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:33.0279253Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:33.0279801Z ##[endgroup] 2025-07-17T08:43:33.0362351Z + python3 /home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-07-17T08:43:33.0528993Z Setting output branch=main 2025-07-17T08:43:33.0589804Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-07-17T08:43:33.0590526Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-07-17T08:43:33.0591122Z echo "Job name: ${JOB_NAME}" 2025-07-17T08:43:33.0591628Z  2025-07-17T08:43:33.0592288Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-07-17T08:43:33.0593097Z # in runner workspace 2025-07-17T08:43:33.0593835Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-07-17T08:43:33.0594666Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-07-17T08:43:33.0595232Z  --job-name "${JOB_NAME}" \ 2025-07-17T08:43:33.0597107Z  --test-matrix "{"include": [{"config": "inductor", "shard": 1, "num_shards": 2, "runner": "linux.rocm.gpu.2"}, {"config": "inductor", "shard": 2, "num_shards": 2, "runner": "linux.rocm.gpu.2"}]}" \ 2025-07-17T08:43:33.0598477Z  --selected-test-configs "" \ 2025-07-17T08:43:33.0599067Z  --pr-number "${PR_NUMBER}" \ 2025-07-17T08:43:33.0599733Z  --tag "${TAG}" \ 2025-07-17T08:43:33.0600230Z  --event-name "${EVENT_NAME}" \ 2025-07-17T08:43:33.0600778Z  --schedule "${SCHEDULE}" \ 2025-07-17T08:43:33.0601313Z  --branch "${HEAD_BRANCH}" 2025-07-17T08:43:33.0649579Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:33.0650361Z env: 2025-07-17T08:43:33.0650829Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:33.0651649Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:33.0652303Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:33.0652821Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:33.0653697Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:33.0654681Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:33.0654954Z AWS_REGION: us-east-1 2025-07-17T08:43:33.0655273Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:33.0655609Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:33.0663781Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:33.0664501Z GITHUB_TOKEN: *** 2025-07-17T08:43:33.0665158Z JOB_NAME: rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:43:33.0665897Z PR_NUMBER: 2025-07-17T08:43:33.0666279Z TAG: 2025-07-17T08:43:33.0666661Z EVENT_NAME: push 2025-07-17T08:43:33.0667062Z SCHEDULE: 2025-07-17T08:43:33.0667442Z HEAD_BRANCH: main 2025-07-17T08:43:33.0667850Z ##[endgroup] 2025-07-17T08:43:33.0746398Z Workflow: inductor-rocm 2025-07-17T08:43:33.0747220Z Job name: rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:43:33.6505483Z Setting output keep-going=True 2025-07-17T08:43:33.6506433Z Setting output ci-verbose-test-logs=False 2025-07-17T08:43:33.6507330Z Setting output ci-test-showlocals=False 2025-07-17T08:43:33.6508232Z Setting output ci-no-test-timeout=False 2025-07-17T08:43:33.6509005Z Setting output ci-no-td=False 2025-07-17T08:43:33.6509740Z Setting output ci-td-distributed=False 2025-07-17T08:43:33.6510597Z Setting output is-unstable=False 2025-07-17T08:43:33.6511366Z Setting output reenabled-issues= 2025-07-17T08:43:33.6538523Z Setting output test-matrix={"include": [{"config": "inductor", "shard": 1, "num_shards": 2, "runner": "linux.rocm.gpu.2"}, {"config": "inductor", "shard": 2, "num_shards": 2, "runner": "linux.rocm.gpu.2"}]} 2025-07-17T08:43:33.6540413Z Setting output is-test-matrix-empty=False 2025-07-17T08:43:33.6750861Z ##[group]Run echo "Filtered matrix:" 2025-07-17T08:43:33.6751493Z echo "Filtered matrix:" 2025-07-17T08:43:33.6752792Z echo "{"include": [{"config": "inductor", "shard": 1, "num_shards": 2, "runner": "linux.rocm.gpu.2"}, {"config": "inductor", "shard": 2, "num_shards": 2, "runner": "linux.rocm.gpu.2"}]}" 2025-07-17T08:43:33.6754116Z  2025-07-17T08:43:33.6754498Z echo 2025-07-17T08:43:33.6754990Z echo "Is the current job unstable? False" 2025-07-17T08:43:33.6755577Z  2025-07-17T08:43:33.6755931Z echo 2025-07-17T08:43:33.6756379Z echo "Is keep-going label set? True" 2025-07-17T08:43:33.6756926Z  2025-07-17T08:43:33.6757292Z echo 2025-07-17T08:43:33.6757708Z echo "Reenabled issues? " 2025-07-17T08:43:33.6802187Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:33.6802865Z env: 2025-07-17T08:43:33.6803259Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:33.6804422Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:33.6805475Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:33.6806589Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:33.6808594Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:33.6810388Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:33.6810978Z AWS_REGION: us-east-1 2025-07-17T08:43:33.6811658Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:33.6812183Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:33.6817184Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:33.6817428Z ##[endgroup] 2025-07-17T08:43:33.6893646Z Filtered matrix: 2025-07-17T08:43:33.6894296Z {include: [{config: inductor, shard: 1, num_shards: 2, runner: linux.rocm.gpu.2}, {config: inductor, shard: 2, num_shards: 2, runner: linux.rocm.gpu.2}]} 2025-07-17T08:43:33.6894868Z 2025-07-17T08:43:33.6895158Z Is the current job unstable? False 2025-07-17T08:43:33.6895410Z 2025-07-17T08:43:33.6895523Z Is keep-going label set? True 2025-07-17T08:43:33.6895694Z 2025-07-17T08:43:33.6895791Z Reenabled issues? 2025-07-17T08:43:33.6961882Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-07-17T08:43:33.6962815Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-07-17T08:43:33.7010384Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T08:43:33.7011182Z env: 2025-07-17T08:43:33.7011645Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:33.7012517Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:33.7013768Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:33.7014857Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:33.7016562Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:33.7018089Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:33.7018625Z AWS_REGION: us-east-1 2025-07-17T08:43:33.7019178Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:33.7019896Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:33.7029947Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:33.7030387Z JOB_TIMEOUT: 300 2025-07-17T08:43:33.7030783Z ##[endgroup] 2025-07-17T08:43:33.7180736Z ##[group]Run set -x 2025-07-17T08:43:33.7181326Z set -x 2025-07-17T08:43:33.7181738Z  2025-07-17T08:43:33.7182195Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-07-17T08:43:33.7182887Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-07-17T08:43:33.7183568Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-07-17T08:43:33.7184190Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-07-17T08:43:33.7184756Z else 2025-07-17T08:43:33.7185196Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-07-17T08:43:33.7185721Z fi 2025-07-17T08:43:33.7186079Z  2025-07-17T08:43:33.7186684Z # detached container should get cleaned up by teardown_ec2_linux 2025-07-17T08:43:33.7187682Z # TODO: Stop building test binaries as part of the build phase 2025-07-17T08:43:33.7188536Z # Used for GPU_FLAG since that doesn't play nice 2025-07-17T08:43:33.7189282Z # shellcheck disable=SC2086,SC2090 2025-07-17T08:43:33.7189878Z container_name=$(docker run \ 2025-07-17T08:43:33.7190423Z  ${GPU_FLAG:-} \ 2025-07-17T08:43:33.7190929Z  -e BUILD_ENVIRONMENT \ 2025-07-17T08:43:33.7191459Z  -e PR_NUMBER \ 2025-07-17T08:43:33.7191932Z  -e GITHUB_ACTIONS \ 2025-07-17T08:43:33.7192436Z  -e GITHUB_REPOSITORY \ 2025-07-17T08:43:33.7192950Z  -e GITHUB_WORKFLOW \ 2025-07-17T08:43:33.7193869Z  -e GITHUB_JOB \ 2025-07-17T08:43:33.7194341Z  -e GITHUB_RUN_ID \ 2025-07-17T08:43:33.7194826Z  -e GITHUB_RUN_NUMBER \ 2025-07-17T08:43:33.7195337Z  -e GITHUB_RUN_ATTEMPT \ 2025-07-17T08:43:33.7195835Z  -e JOB_ID \ 2025-07-17T08:43:33.7196266Z  -e JOB_NAME \ 2025-07-17T08:43:33.7196700Z  -e BRANCH \ 2025-07-17T08:43:33.7197126Z  -e SHA1 \ 2025-07-17T08:43:33.7197559Z  -e AWS_DEFAULT_REGION \ 2025-07-17T08:43:33.7198068Z  -e IN_WHEEL_TEST \ 2025-07-17T08:43:33.7198543Z  -e SHARD_NUMBER \ 2025-07-17T08:43:33.7199034Z  -e TEST_CONFIG \ 2025-07-17T08:43:33.7199665Z  -e NUM_TEST_SHARDS \ 2025-07-17T08:43:33.7200159Z  -e REENABLED_ISSUES \ 2025-07-17T08:43:33.7200675Z  -e CONTINUE_THROUGH_ERROR \ 2025-07-17T08:43:33.7201216Z  -e VERBOSE_TEST_LOGS \ 2025-07-17T08:43:33.7201725Z  -e TEST_SHOWLOCALS \ 2025-07-17T08:43:33.7202197Z  -e NO_TEST_TIMEOUT \ 2025-07-17T08:43:33.7202681Z  -e NO_TD \ 2025-07-17T08:43:33.7203166Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-07-17T08:43:33.7203790Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-07-17T08:43:33.7204421Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-07-17T08:43:33.7205006Z  -e TESTS_TO_INCLUDE \ 2025-07-17T08:43:33.7205505Z  -e DASHBOARD_TAG \ 2025-07-17T08:43:33.7206251Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-07-17T08:43:33.7207094Z  --ulimit stack=10485760:83886080 \ 2025-07-17T08:43:33.7207732Z  --ulimit core=0 \ 2025-07-17T08:43:33.7208354Z  --security-opt seccomp=unconfined \ 2025-07-17T08:43:33.7209041Z  --cap-add=SYS_PTRACE \ 2025-07-17T08:43:33.7209632Z  --shm-size="8g" \ 2025-07-17T08:43:33.7210150Z  --tty \ 2025-07-17T08:43:33.7210643Z  --detach \ 2025-07-17T08:43:33.7211190Z  --name="${container_name}" \ 2025-07-17T08:43:33.7211824Z  --user jenkins \ 2025-07-17T08:43:33.7212256Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-07-17T08:43:33.7212669Z  -w /var/lib/jenkins/workspace \ 2025-07-17T08:43:33.7212953Z  "${DOCKER_IMAGE}" 2025-07-17T08:43:33.7213186Z ) 2025-07-17T08:43:33.7213415Z # save container name for later step 2025-07-17T08:43:33.7213985Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-07-17T08:43:33.7214654Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-07-17T08:43:33.7215494Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-07-17T08:43:33.7256727Z shell: /usr/bin/bash -e {0} 2025-07-17T08:43:33.7256994Z env: 2025-07-17T08:43:33.7257206Z GIT_DEFAULT_BRANCH: main 2025-07-17T08:43:33.7257586Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T08:43:33.7258299Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T08:43:33.7259498Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T08:43:33.7261404Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T08:43:33.7262906Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T08:43:33.7263381Z AWS_REGION: us-east-1 2025-07-17T08:43:33.7263925Z AWS_ACCESS_KEY_ID: *** 2025-07-17T08:43:33.7264564Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T08:43:33.7274243Z AWS_SESSION_TOKEN: *** 2025-07-17T08:43:33.7274758Z BUILD_ENVIRONMENT: linux-jammy-rocm-py3.10 2025-07-17T08:43:33.7275317Z PR_NUMBER: 2025-07-17T08:43:33.7276108Z GITHUB_REPOSITORY: pytorch/pytorch 2025-07-17T08:43:33.7276646Z GITHUB_WORKFLOW: inductor-rocm 2025-07-17T08:43:33.7277125Z GITHUB_JOB: test 2025-07-17T08:43:33.7277531Z GITHUB_RUN_ID: 16337959866 2025-07-17T08:43:33.7277993Z GITHUB_RUN_NUMBER: 38617 2025-07-17T08:43:33.7278439Z GITHUB_RUN_ATTEMPT: 1 2025-07-17T08:43:33.7278854Z JOB_ID: 46159470975 2025-07-17T08:43:33.7279639Z JOB_NAME: rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:43:33.7280368Z BRANCH: main 2025-07-17T08:43:33.7280823Z SHA1: a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:43:33.7281403Z CONTINUE_THROUGH_ERROR: True 2025-07-17T08:43:33.7281879Z VERBOSE_TEST_LOGS: False 2025-07-17T08:43:33.7282317Z TEST_SHOWLOCALS: False 2025-07-17T08:43:33.7282755Z NO_TEST_TIMEOUT: False 2025-07-17T08:43:33.7283167Z NO_TD: False 2025-07-17T08:43:33.7283546Z TEST_CONFIG: inductor 2025-07-17T08:43:33.7283959Z SHARD_NUMBER: 2 2025-07-17T08:43:33.7284351Z NUM_TEST_SHARDS: 2 2025-07-17T08:43:33.7284766Z REENABLED_ISSUES: 2025-07-17T08:43:33.7286047Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:43:33.7287631Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2025-07-17T08:43:33.7288274Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-07-17T08:43:33.7288870Z TESTS_TO_INCLUDE: 2025-07-17T08:43:33.7289328Z DASHBOARD_TAG: 2025-07-17T08:43:33.7289776Z ##[endgroup] 2025-07-17T08:43:33.7354057Z + [[ inductor == \m\u\l\t\i\g\p\u ]] 2025-07-17T08:43:33.7354793Z + [[ linux-jammy-rocm-py3.10 == *onnx* ]] 2025-07-17T08:43:33.7355418Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-07-17T08:43:33.7364398Z +++ nproc --ignore=2 2025-07-17T08:43:33.7390861Z ++ docker run --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=62 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e DASHBOARD_TAG --env-file=/home/pytorchci/actions-runner/_work/_temp/github_env_16337959866 --ulimit stack=10485760:83886080 --ulimit core=0 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/pytorchci/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-01345e7669bb7198df9fce7a02a4a12ce8c84f2d 2025-07-17T08:43:38.1627658Z + container_name=55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T08:43:38.1629233Z + echo CONTAINER_NAME=55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T08:43:38.1634802Z + docker exec -t 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c sh -c 'cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-07-17T08:43:59.5178684Z Processing ./dist/torch-2.9.0a0+gita38f433-cp310-cp310-linux_x86_64.whl 2025-07-17T08:44:00.2956377Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+gita38f433) (3.18.0) 2025-07-17T08:44:00.2959177Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+gita38f433) (4.14.1) 2025-07-17T08:44:00.2962295Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+gita38f433) (1.13.3) 2025-07-17T08:44:00.2965378Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+gita38f433) (2.8.8) 2025-07-17T08:44:00.2968505Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+gita38f433) (3.1.6) 2025-07-17T08:44:00.2971260Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+gita38f433) (2025.5.1) 2025-07-17T08:44:00.2984619Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.9.0a0+gita38f433) (1.3.0) 2025-07-17T08:44:00.3292813Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.9.0a0+gita38f433) (3.0.2) 2025-07-17T08:44:00.8696448Z Installing collected packages: torch 2025-07-17T08:44:11.6949333Z ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. 2025-07-17T08:44:11.6953665Z helion 0.0.9 requires filecheck, which is not installed. 2025-07-17T08:44:11.6955066Z timm 1.0.14 requires torchvision, which is not installed. 2025-07-17T08:44:11.6955997Z Successfully installed torch-2.9.0a0+gita38f433 2025-07-17T08:44:11.7493390Z + export TERM=vt100 2025-07-17T08:44:11.7493931Z + TERM=vt100 2025-07-17T08:44:11.7501735Z ++ dirname .ci/pytorch/test.sh 2025-07-17T08:44:11.7526109Z + source .ci/pytorch/common.sh 2025-07-17T08:44:11.7535294Z +++ dirname .ci/pytorch/common.sh 2025-07-17T08:44:11.7545422Z ++ source .ci/pytorch/common_utils.sh 2025-07-17T08:44:11.7546601Z +++ declare -f -t trap_add 2025-07-17T08:44:11.7550653Z ++ set -ex -o pipefail 2025-07-17T08:44:11.7551265Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-07-17T08:44:11.7551898Z ++ unset HIP_PLATFORM 2025-07-17T08:44:11.7552412Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-07-17T08:44:11.7553019Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-07-17T08:44:11.7553497Z ++ BUILD_TEST_LIBTORCH=0 2025-07-17T08:44:11.7558942Z ++ dirname .ci/pytorch/test.sh 2025-07-17T08:44:11.7583091Z + source .ci/pytorch/common-build.sh 2025-07-17T08:44:11.7585883Z ++ [[ linux-jammy-rocm-py3.10 != *win-* ]] 2025-07-17T08:44:11.7601385Z ++++ dirname .ci/pytorch/common-build.sh 2025-07-17T08:44:11.7616863Z +++ cd .ci/pytorch 2025-07-17T08:44:11.7617405Z +++ pwd -P 2025-07-17T08:44:11.7622169Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-07-17T08:44:11.7623802Z ++ [[ linux-jammy-rocm-py3.10 == *-pch* ]] 2025-07-17T08:44:11.7624398Z ++ which sccache 2025-07-17T08:44:11.7635305Z ++ [[ -z '' ]] 2025-07-17T08:44:11.7635573Z ++ unset SCCACHE_BUCKET 2025-07-17T08:44:11.7635839Z ++ unset SCCACHE_REGION 2025-07-17T08:44:11.7636093Z ++ sccache --stop-server 2025-07-17T08:44:11.7668580Z ++ true 2025-07-17T08:44:11.7669120Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-07-17T08:44:11.7683609Z ++ trap_add sccache_epilogue EXIT 2025-07-17T08:44:11.7684378Z ++ trap_add_cmd=sccache_epilogue 2025-07-17T08:44:11.7684701Z ++ shift 2025-07-17T08:44:11.7684961Z ++ for trap_add_name in "$@" 2025-07-17T08:44:11.7691688Z ++++ trap -p EXIT 2025-07-17T08:44:11.7694879Z +++ eval 'extract_trap_cmd ' 2025-07-17T08:44:11.7695188Z ++++ extract_trap_cmd 2025-07-17T08:44:11.7695433Z ++++ printf '%s\n' '' 2025-07-17T08:44:11.7695687Z +++ printf '%s\n' sccache_epilogue 2025-07-17T08:44:11.7698400Z ++ trap -- ' 2025-07-17T08:44:11.7698664Z sccache_epilogue' EXIT 2025-07-17T08:44:11.7698910Z ++ [[ -n '' ]] 2025-07-17T08:44:11.7699181Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-07-17T08:44:11.7699560Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-07-17T08:44:11.7699907Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-07-17T08:44:11.7700157Z ++ sccache --start-server 2025-07-17T08:44:11.7725982Z sccache: Starting the server... 2025-07-17T08:44:11.7867384Z sccache: Listening on address 127.0.0.1:4226 2025-07-17T08:44:11.7876415Z ++ sccache --zero-stats 2025-07-17T08:44:11.7902467Z Statistics zeroed. 2025-07-17T08:44:11.7910044Z ++ which ccache 2025-07-17T08:44:11.7923740Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-07-17T08:44:11.7924379Z + echo 'Environment variables:' 2025-07-17T08:44:11.7924875Z Environment variables: 2025-07-17T08:44:11.7925295Z + env 2025-07-17T08:44:11.7934891Z GITHUB_WORKSPACE=/home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-07-17T08:44:11.7935753Z CONTINUE_THROUGH_ERROR=True 2025-07-17T08:44:11.7936293Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-07-17T08:44:11.7936903Z HOSTNAME=pytorch-rocm-hw-43 2025-07-17T08:44:11.7937949Z GITHUB_PATH=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/add_path_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.7938992Z GITHUB_ACTION=__self 2025-07-17T08:44:11.7939444Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-07-17T08:44:11.7939962Z GITHUB_RUN_NUMBER=38617 2025-07-17T08:44:11.7940375Z TEST_CONFIG=inductor 2025-07-17T08:44:11.7940865Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-07-17T08:44:11.7941485Z AWS_DEFAULT_REGION=us-east-1 2025-07-17T08:44:11.7942077Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-07-17T08:44:11.7942698Z GITHUB_REF_TYPE=branch 2025-07-17T08:44:11.7946207Z *** 2025-07-17T08:44:11.7946464Z GITHUB_REPOSITORY_ID=65600975 2025-07-17T08:44:11.7946770Z GITHUB_ACTIONS=true 2025-07-17T08:44:11.7947040Z SHA1=a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:44:11.7947388Z GITHUB_SHA=a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:44:11.7947895Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/inductor-rocm.yml@refs/heads/main 2025-07-17T08:44:11.7948352Z UCC_HOME=/usr 2025-07-17T08:44:11.7948578Z VERBOSE_TEST_LOGS=False 2025-07-17T08:44:11.7948824Z GITHUB_REF=refs/heads/main 2025-07-17T08:44:11.7949067Z SHARD_NUMBER=2 2025-07-17T08:44:11.7949296Z GITHUB_REF_PROTECTED=true 2025-07-17T08:44:11.7949554Z HOME=/var/lib/jenkins 2025-07-17T08:44:11.7949811Z GITHUB_API_URL=https://api.github.com 2025-07-17T08:44:11.7950128Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-07-17T08:44:11.7950408Z LANG=C.UTF-8 2025-07-17T08:44:11.7950665Z UCX_COMMIT=cc312eaa4655c0cc5c2bcd796db938f90563bcf6 2025-07-17T08:44:11.7950992Z PYTORCH_TEST_WITH_ROCM=1 2025-07-17T08:44:11.7951235Z NUM_TEST_SHARDS=2 2025-07-17T08:44:11.7951441Z UCX_HOME=/usr 2025-07-17T08:44:11.7951966Z GITHUB_STATE=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/save_state_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.7952667Z JOB_NAME=rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:44:11.7953273Z MAGMA_HOME=/opt/rocm/magma 2025-07-17T08:44:11.7953816Z GITHUB_ENV=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_env_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.7954550Z GITHUB_EVENT_PATH=/home/pytorchci/actions-runner/_work/_temp/_github_workflow/event.json 2025-07-17T08:44:11.7955010Z GITHUB_EVENT_NAME=push 2025-07-17T08:44:11.7955239Z DASHBOARD_TAG= 2025-07-17T08:44:11.7955459Z GITHUB_RUN_ID=16337959866 2025-07-17T08:44:11.7956046Z GITHUB_STEP_SUMMARY=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/step_summary_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.7956665Z GITHUB_ACTOR=pytorchmergebot 2025-07-17T08:44:11.7956923Z PR_NUMBER= 2025-07-17T08:44:11.7957130Z GITHUB_RUN_ATTEMPT=1 2025-07-17T08:44:11.7957369Z ANACONDA_PYTHON_VERSION=3.10 2025-07-17T08:44:11.7957682Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-07-17T08:44:11.7958000Z TERM=vt100 2025-07-17T08:44:11.7958203Z INSTALLED_VISION=yes 2025-07-17T08:44:11.7958417Z BRANCH=main 2025-07-17T08:44:11.7958638Z OPENSSL_ROOT_DIR=/opt/openssl 2025-07-17T08:44:11.7958912Z TESTS_TO_INCLUDE= 2025-07-17T08:44:11.7959443Z GITHUB_ACTION_PATH=/home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-07-17T08:44:11.7959974Z GITHUB_SERVER_URL=https://github.com 2025-07-17T08:44:11.7960265Z PYTORCH_ROCM_ARCH=gfx90a;gfx942 2025-07-17T08:44:11.7960559Z UCC_COMMIT=0c0fc21559835044ab107199e334f7157d6a0d3d 2025-07-17T08:44:11.7961041Z REENABLED_ISSUES= 2025-07-17T08:44:11.7961253Z SHLVL=1 2025-07-17T08:44:11.7961444Z MAX_JOBS=62 2025-07-17T08:44:11.7961648Z GITHUB_ACTOR_ID=97764156 2025-07-17T08:44:11.7961961Z GITHUB_WORKFLOW_SHA=a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:44:11.7962295Z GITHUB_REF_NAME=main 2025-07-17T08:44:11.7962535Z ROCM_PATH=/opt/rocm 2025-07-17T08:44:11.7962740Z GITHUB_JOB=test 2025-07-17T08:44:11.7962947Z NO_TEST_TIMEOUT=False 2025-07-17T08:44:11.7963187Z GITHUB_REPOSITORY=pytorch/pytorch 2025-07-17T08:44:11.7963447Z LC_ALL=C.UTF-8 2025-07-17T08:44:11.7963658Z GITHUB_RETENTION_DAYS=90 2025-07-17T08:44:11.7963896Z OPENSSL_DIR=/opt/openssl 2025-07-17T08:44:11.7964132Z GITHUB_ACTION_REPOSITORY= 2025-07-17T08:44:11.7964983Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-07-17T08:44:11.7965842Z GITHUB_BASE_REF= 2025-07-17T08:44:11.7966050Z CI=true 2025-07-17T08:44:11.7966254Z GITHUB_REPOSITORY_OWNER=pytorch 2025-07-17T08:44:11.7966505Z JOB_ID=46159470975 2025-07-17T08:44:11.7966708Z GITHUB_HEAD_REF= 2025-07-17T08:44:11.7966912Z GITHUB_ACTION_REF= 2025-07-17T08:44:11.7967119Z TEST_SHOWLOCALS=False 2025-07-17T08:44:11.7967348Z GITHUB_WORKFLOW=inductor-rocm 2025-07-17T08:44:11.7967609Z DEBIAN_FRONTEND=noninteractive 2025-07-17T08:44:11.7968174Z GITHUB_OUTPUT=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_output_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.7968756Z NO_TD=False 2025-07-17T08:44:11.7968950Z OLDPWD=/var/lib/jenkins 2025-07-17T08:44:11.7969168Z _=/usr/bin/env 2025-07-17T08:44:11.7969448Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-07-17T08:44:11.8061948Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-07-17T08:44:11.8062867Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-07-17T08:44:11.8063719Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-07-17T08:44:11.8064526Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-07-17T08:44:11.8065134Z + BUILD_DIR=build 2025-07-17T08:44:11.8065477Z + BUILD_RENAMED_DIR=build_renamed 2025-07-17T08:44:11.8065876Z + BUILD_BIN_DIR=build/bin 2025-07-17T08:44:11.8066223Z + SHARD_NUMBER=2 2025-07-17T08:44:11.8066531Z + NUM_TEST_SHARDS=2 2025-07-17T08:44:11.8066881Z + export TORCH_SERIALIZATION_DEBUG=1 2025-07-17T08:44:11.8067308Z + TORCH_SERIALIZATION_DEBUG=1 2025-07-17T08:44:11.8068045Z + export VALGRIND=ON 2025-07-17T08:44:11.8068458Z + VALGRIND=ON 2025-07-17T08:44:11.8068877Z + [[ linux-jammy-rocm-py3.10 == *clang9* ]] 2025-07-17T08:44:11.8069435Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-07-17T08:44:11.8069995Z + [[ linux-jammy-rocm-py3.10 == *s390x* ]] 2025-07-17T08:44:11.8070475Z + [[ 0 == \1 ]] 2025-07-17T08:44:11.8070826Z + [[ True == \1 ]] 2025-07-17T08:44:11.8071238Z + [[ linux-jammy-rocm-py3.10 != *bazel* ]] 2025-07-17T08:44:11.8071799Z ++ realpath build/custom_test_artifacts 2025-07-17T08:44:11.8081336Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-07-17T08:44:11.8081803Z + [[ -n '' ]] 2025-07-17T08:44:11.8082048Z + echo 'Environment variables' 2025-07-17T08:44:11.8082340Z Environment variables 2025-07-17T08:44:11.8082568Z + env 2025-07-17T08:44:11.8091441Z GITHUB_WORKSPACE=/home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-07-17T08:44:11.8091855Z CONTINUE_THROUGH_ERROR=True 2025-07-17T08:44:11.8092159Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-07-17T08:44:11.8092482Z HOSTNAME=pytorch-rocm-hw-43 2025-07-17T08:44:11.8093072Z GITHUB_PATH=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/add_path_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.8093649Z GITHUB_ACTION=__self 2025-07-17T08:44:11.8093902Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-07-17T08:44:11.8094181Z GITHUB_RUN_NUMBER=38617 2025-07-17T08:44:11.8094642Z TEST_CONFIG=inductor 2025-07-17T08:44:11.8094881Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-07-17T08:44:11.8095166Z AWS_DEFAULT_REGION=us-east-1 2025-07-17T08:44:11.8095438Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-07-17T08:44:11.8095728Z GITHUB_REF_TYPE=branch 2025-07-17T08:44:11.8096007Z *** 2025-07-17T08:44:11.8096215Z GITHUB_REPOSITORY_ID=65600975 2025-07-17T08:44:11.8096476Z GITHUB_ACTIONS=true 2025-07-17T08:44:11.8096746Z SHA1=a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:44:11.8097081Z GITHUB_SHA=a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:44:11.8097579Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/inductor-rocm.yml@refs/heads/main 2025-07-17T08:44:11.8098018Z UCC_HOME=/usr 2025-07-17T08:44:11.8098244Z TORCH_SERIALIZATION_DEBUG=1 2025-07-17T08:44:11.8098500Z VERBOSE_TEST_LOGS=False 2025-07-17T08:44:11.8098737Z GITHUB_REF=refs/heads/main 2025-07-17T08:44:11.8098977Z SHARD_NUMBER=2 2025-07-17T08:44:11.8099193Z GITHUB_REF_PROTECTED=true 2025-07-17T08:44:11.8099442Z HOME=/var/lib/jenkins 2025-07-17T08:44:11.8099700Z GITHUB_API_URL=https://api.github.com 2025-07-17T08:44:11.8100010Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-07-17T08:44:11.8100282Z LANG=C.UTF-8 2025-07-17T08:44:11.8100529Z UCX_COMMIT=cc312eaa4655c0cc5c2bcd796db938f90563bcf6 2025-07-17T08:44:11.8100857Z PYTORCH_TEST_WITH_ROCM=1 2025-07-17T08:44:11.8101103Z NUM_TEST_SHARDS=2 2025-07-17T08:44:11.8101318Z UCX_HOME=/usr 2025-07-17T08:44:11.8101843Z GITHUB_STATE=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/save_state_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.8102535Z JOB_NAME=rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T08:44:11.8102928Z MAGMA_HOME=/opt/rocm/magma 2025-07-17T08:44:11.8103461Z GITHUB_ENV=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_env_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.8104187Z GITHUB_EVENT_PATH=/home/pytorchci/actions-runner/_work/_temp/_github_workflow/event.json 2025-07-17T08:44:11.8104650Z GITHUB_EVENT_NAME=push 2025-07-17T08:44:11.8104880Z DASHBOARD_TAG= 2025-07-17T08:44:11.8105095Z GITHUB_RUN_ID=16337959866 2025-07-17T08:44:11.8105662Z GITHUB_STEP_SUMMARY=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/step_summary_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.8106280Z GITHUB_ACTOR=pytorchmergebot 2025-07-17T08:44:11.8106532Z PR_NUMBER= 2025-07-17T08:44:11.8106741Z GITHUB_RUN_ATTEMPT=1 2025-07-17T08:44:11.8106965Z VALGRIND=ON 2025-07-17T08:44:11.8107185Z ANACONDA_PYTHON_VERSION=3.10 2025-07-17T08:44:11.8107654Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-07-17T08:44:11.8107977Z TERM=vt100 2025-07-17T08:44:11.8108179Z INSTALLED_VISION=yes 2025-07-17T08:44:11.8108396Z BRANCH=main 2025-07-17T08:44:11.8108605Z OPENSSL_ROOT_DIR=/opt/openssl 2025-07-17T08:44:11.8108844Z TESTS_TO_INCLUDE= 2025-07-17T08:44:11.8109288Z GITHUB_ACTION_PATH=/home/pytorchci/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-07-17T08:44:11.8109805Z GITHUB_SERVER_URL=https://github.com 2025-07-17T08:44:11.8110104Z PYTORCH_ROCM_ARCH=gfx90a;gfx942 2025-07-17T08:44:11.8110396Z UCC_COMMIT=0c0fc21559835044ab107199e334f7157d6a0d3d 2025-07-17T08:44:11.8110705Z REENABLED_ISSUES= 2025-07-17T08:44:11.8110920Z SHLVL=1 2025-07-17T08:44:11.8111106Z MAX_JOBS=62 2025-07-17T08:44:11.8111310Z GITHUB_ACTOR_ID=97764156 2025-07-17T08:44:11.8111620Z GITHUB_WORKFLOW_SHA=a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T08:44:11.8111967Z GITHUB_REF_NAME=main 2025-07-17T08:44:11.8112187Z ROCM_PATH=/opt/rocm 2025-07-17T08:44:11.8112400Z GITHUB_JOB=test 2025-07-17T08:44:11.8112621Z NO_TEST_TIMEOUT=False 2025-07-17T08:44:11.8112867Z GITHUB_REPOSITORY=pytorch/pytorch 2025-07-17T08:44:11.8113131Z LC_ALL=C.UTF-8 2025-07-17T08:44:11.8113348Z GITHUB_RETENTION_DAYS=90 2025-07-17T08:44:11.8113592Z OPENSSL_DIR=/opt/openssl 2025-07-17T08:44:11.8113831Z GITHUB_ACTION_REPOSITORY= 2025-07-17T08:44:11.8114680Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-07-17T08:44:11.8115699Z GITHUB_BASE_REF= 2025-07-17T08:44:11.8115904Z CI=true 2025-07-17T08:44:11.8116103Z GITHUB_REPOSITORY_OWNER=pytorch 2025-07-17T08:44:11.8116358Z JOB_ID=46159470975 2025-07-17T08:44:11.8116568Z GITHUB_HEAD_REF= 2025-07-17T08:44:11.8116777Z GITHUB_ACTION_REF= 2025-07-17T08:44:11.8116995Z TEST_SHOWLOCALS=False 2025-07-17T08:44:11.8117238Z GITHUB_WORKFLOW=inductor-rocm 2025-07-17T08:44:11.8117504Z DEBIAN_FRONTEND=noninteractive 2025-07-17T08:44:11.8118086Z GITHUB_OUTPUT=/home/pytorchci/actions-runner/_work/_temp/_runner_file_commands/set_output_5dabb3ea-9759-4eca-8775-3e51f7912b28 2025-07-17T08:44:11.8118866Z NO_TD=False 2025-07-17T08:44:11.8119230Z OLDPWD=/var/lib/jenkins 2025-07-17T08:44:11.8119725Z _=/usr/bin/env 2025-07-17T08:44:11.8120117Z + echo 'Testing pytorch' 2025-07-17T08:44:11.8120529Z Testing pytorch 2025-07-17T08:44:11.8120892Z + export LANG=C.UTF-8 2025-07-17T08:44:11.8121233Z + LANG=C.UTF-8 2025-07-17T08:44:11.8121533Z + PR_NUMBER= 2025-07-17T08:44:11.8121850Z + [[ inductor == \d\e\f\a\u\l\t ]] 2025-07-17T08:44:11.8122267Z + [[ inductor == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-07-17T08:44:11.8122691Z + [[ inductor == \s\l\o\w ]] 2025-07-17T08:44:11.8123123Z + [[ linux-jammy-rocm-py3.10 == *slow-gradcheck* ]] 2025-07-17T08:44:11.8123627Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-07-17T08:44:11.8124073Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-07-17T08:44:11.8124525Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-07-17T08:44:11.8125007Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-07-17T08:44:11.8125433Z + [[ inductor == *crossref* ]] 2025-07-17T08:44:11.8125828Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-07-17T08:44:11.8126246Z + export VALGRIND=OFF 2025-07-17T08:44:11.8126572Z + VALGRIND=OFF 2025-07-17T08:44:11.8126865Z + rocminfo 2025-07-17T08:44:11.8230038Z ROCk module version 6.8.5 is loaded 2025-07-17T08:44:11.9309727Z ===================== 2025-07-17T08:44:11.9310379Z HSA System Attributes 2025-07-17T08:44:11.9310855Z ===================== 2025-07-17T08:44:11.9311305Z Runtime Version: 1.15 2025-07-17T08:44:11.9311790Z Runtime Ext Version: 1.7 2025-07-17T08:44:11.9312329Z System Timestamp Freq.: 1000.000000MHz 2025-07-17T08:44:11.9313210Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-07-17T08:44:11.9314132Z Machine Model: LARGE 2025-07-17T08:44:11.9315373Z System Endianness: LITTLE 2025-07-17T08:44:11.9316050Z Mwaitx: DISABLED 2025-07-17T08:44:11.9316553Z XNACK enabled: NO 2025-07-17T08:44:11.9317043Z DMAbuf Support: YES 2025-07-17T08:44:11.9317512Z VMM Support: YES 2025-07-17T08:44:11.9333129Z 2025-07-17T08:44:11.9333248Z ========== 2025-07-17T08:44:11.9333534Z HSA Agents 2025-07-17T08:44:11.9333786Z ========== 2025-07-17T08:44:11.9334028Z ******* 2025-07-17T08:44:11.9334256Z Agent 1 2025-07-17T08:44:11.9334481Z ******* 2025-07-17T08:44:11.9334784Z Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:44:11.9335169Z Uuid: CPU-XX 2025-07-17T08:44:11.9335532Z Marketing Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:44:11.9335900Z Vendor Name: CPU 2025-07-17T08:44:11.9336251Z Feature: None specified 2025-07-17T08:44:11.9336610Z Profile: FULL_PROFILE 2025-07-17T08:44:11.9336971Z Float Round Mode: NEAR 2025-07-17T08:44:11.9337335Z Max Queue Number: 0(0x0) 2025-07-17T08:44:11.9337692Z Queue Min Size: 0(0x0) 2025-07-17T08:44:11.9338316Z Queue Max Size: 0(0x0) 2025-07-17T08:44:11.9338748Z Queue Type: MULTI 2025-07-17T08:44:11.9339142Z Node: 0 2025-07-17T08:44:11.9339535Z Device Type: CPU 2025-07-17T08:44:11.9339907Z Cache Info: 2025-07-17T08:44:11.9340217Z L1: 32768(0x8000) KB 2025-07-17T08:44:11.9340551Z Chip ID: 0(0x0) 2025-07-17T08:44:11.9340892Z ASIC Revision: 0(0x0) 2025-07-17T08:44:11.9341255Z Cacheline Size: 64(0x40) 2025-07-17T08:44:11.9341608Z Max Clock Freq. (MHz): 2600 2025-07-17T08:44:11.9341941Z BDFID: 0 2025-07-17T08:44:11.9342278Z Internal Node ID: 0 2025-07-17T08:44:11.9342648Z Compute Unit: 32 2025-07-17T08:44:11.9342993Z SIMDs per CU: 0 2025-07-17T08:44:11.9343347Z Shader Engines: 0 2025-07-17T08:44:11.9343717Z Shader Arrs. per Eng.: 0 2025-07-17T08:44:11.9344100Z WatchPts on Addr. Ranges:1 2025-07-17T08:44:11.9344439Z Memory Properties: 2025-07-17T08:44:11.9344706Z Features: None 2025-07-17T08:44:11.9344971Z Pool Info: 2025-07-17T08:44:11.9345213Z Pool 1 2025-07-17T08:44:11.9345512Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-07-17T08:44:11.9345871Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:44:11.9346225Z Allocatable: TRUE 2025-07-17T08:44:11.9346602Z Alloc Granule: 4KB 2025-07-17T08:44:11.9346993Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9347385Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9347766Z Accessible by all: TRUE 2025-07-17T08:44:11.9348095Z Pool 2 2025-07-17T08:44:11.9348399Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-07-17T08:44:11.9348923Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:44:11.9349300Z Allocatable: TRUE 2025-07-17T08:44:11.9349673Z Alloc Granule: 4KB 2025-07-17T08:44:11.9350061Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9350453Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9350826Z Accessible by all: TRUE 2025-07-17T08:44:11.9351154Z Pool 3 2025-07-17T08:44:11.9351454Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-07-17T08:44:11.9351801Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:44:11.9352142Z Allocatable: TRUE 2025-07-17T08:44:11.9352493Z Alloc Granule: 4KB 2025-07-17T08:44:11.9352876Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9353257Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9353626Z Accessible by all: TRUE 2025-07-17T08:44:11.9353948Z Pool 4 2025-07-17T08:44:11.9354237Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:44:11.9354578Z Size: 65790776(0x3ebe338) KB 2025-07-17T08:44:11.9355070Z Allocatable: TRUE 2025-07-17T08:44:11.9355427Z Alloc Granule: 4KB 2025-07-17T08:44:11.9355851Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9356230Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9356597Z Accessible by all: TRUE 2025-07-17T08:44:11.9356918Z ISA Info: 2025-07-17T08:44:11.9357165Z ******* 2025-07-17T08:44:11.9357394Z Agent 2 2025-07-17T08:44:11.9357619Z ******* 2025-07-17T08:44:11.9357892Z Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:44:11.9358267Z Uuid: CPU-XX 2025-07-17T08:44:11.9358811Z Marketing Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:44:11.9359807Z Vendor Name: CPU 2025-07-17T08:44:11.9360569Z Feature: None specified 2025-07-17T08:44:11.9361297Z Profile: FULL_PROFILE 2025-07-17T08:44:11.9361921Z Float Round Mode: NEAR 2025-07-17T08:44:11.9362571Z Max Queue Number: 0(0x0) 2025-07-17T08:44:11.9363206Z Queue Min Size: 0(0x0) 2025-07-17T08:44:11.9363835Z Queue Max Size: 0(0x0) 2025-07-17T08:44:11.9364450Z Queue Type: MULTI 2025-07-17T08:44:11.9365035Z Node: 1 2025-07-17T08:44:11.9365625Z Device Type: CPU 2025-07-17T08:44:11.9366182Z Cache Info: 2025-07-17T08:44:11.9366654Z L1: 32768(0x8000) KB 2025-07-17T08:44:11.9367235Z Chip ID: 0(0x0) 2025-07-17T08:44:11.9367846Z ASIC Revision: 0(0x0) 2025-07-17T08:44:11.9368581Z Cacheline Size: 64(0x40) 2025-07-17T08:44:11.9369343Z Max Clock Freq. (MHz): 2600 2025-07-17T08:44:11.9370038Z BDFID: 0 2025-07-17T08:44:11.9371084Z Internal Node ID: 1 2025-07-17T08:44:11.9371867Z Compute Unit: 32 2025-07-17T08:44:11.9372601Z SIMDs per CU: 0 2025-07-17T08:44:11.9373337Z Shader Engines: 0 2025-07-17T08:44:11.9374025Z Shader Arrs. per Eng.: 0 2025-07-17T08:44:11.9374463Z WatchPts on Addr. Ranges:1 2025-07-17T08:44:11.9374841Z Memory Properties: 2025-07-17T08:44:11.9375099Z Features: None 2025-07-17T08:44:11.9375354Z Pool Info: 2025-07-17T08:44:11.9375592Z Pool 1 2025-07-17T08:44:11.9375890Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-07-17T08:44:11.9376248Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:44:11.9376582Z Allocatable: TRUE 2025-07-17T08:44:11.9376951Z Alloc Granule: 4KB 2025-07-17T08:44:11.9377331Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9377715Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9378085Z Accessible by all: TRUE 2025-07-17T08:44:11.9378410Z Pool 2 2025-07-17T08:44:11.9378861Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-07-17T08:44:11.9379211Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:44:11.9379552Z Allocatable: TRUE 2025-07-17T08:44:11.9379922Z Alloc Granule: 4KB 2025-07-17T08:44:11.9380373Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9381182Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9382026Z Accessible by all: TRUE 2025-07-17T08:44:11.9382746Z Pool 3 2025-07-17T08:44:11.9383336Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-07-17T08:44:11.9383980Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:44:11.9384627Z Allocatable: TRUE 2025-07-17T08:44:11.9385319Z Alloc Granule: 4KB 2025-07-17T08:44:11.9386033Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9386746Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9387446Z Accessible by all: TRUE 2025-07-17T08:44:11.9388046Z Pool 4 2025-07-17T08:44:11.9388588Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:44:11.9389246Z Size: 66046476(0x3efca0c) KB 2025-07-17T08:44:11.9389894Z Allocatable: TRUE 2025-07-17T08:44:11.9390568Z Alloc Granule: 4KB 2025-07-17T08:44:11.9391278Z Alloc Recommended Granule:4KB 2025-07-17T08:44:11.9392035Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9392753Z Accessible by all: TRUE 2025-07-17T08:44:11.9393365Z ISA Info: 2025-07-17T08:44:11.9393800Z ******* 2025-07-17T08:44:11.9394221Z Agent 3 2025-07-17T08:44:11.9394645Z ******* 2025-07-17T08:44:11.9395146Z Name: gfx90a 2025-07-17T08:44:11.9395766Z Uuid: GPU-57cd99ed4cd4643e 2025-07-17T08:44:11.9396440Z Marketing Name: AMD Instinct MI210 2025-07-17T08:44:11.9397426Z Vendor Name: AMD 2025-07-17T08:44:11.9398123Z Feature: KERNEL_DISPATCH 2025-07-17T08:44:11.9398903Z Profile: BASE_PROFILE 2025-07-17T08:44:11.9399830Z Float Round Mode: NEAR 2025-07-17T08:44:11.9400625Z Max Queue Number: 128(0x80) 2025-07-17T08:44:11.9401423Z Queue Min Size: 64(0x40) 2025-07-17T08:44:11.9402201Z Queue Max Size: 131072(0x20000) 2025-07-17T08:44:11.9402977Z Queue Type: MULTI 2025-07-17T08:44:11.9403708Z Node: 2 2025-07-17T08:44:11.9404227Z Device Type: GPU 2025-07-17T08:44:11.9404579Z Cache Info: 2025-07-17T08:44:11.9404839Z L1: 16(0x10) KB 2025-07-17T08:44:11.9405136Z L2: 8192(0x2000) KB 2025-07-17T08:44:11.9405433Z Chip ID: 29711(0x740f) 2025-07-17T08:44:11.9405774Z ASIC Revision: 1(0x1) 2025-07-17T08:44:11.9406126Z Cacheline Size: 64(0x40) 2025-07-17T08:44:11.9406479Z Max Clock Freq. (MHz): 1700 2025-07-17T08:44:11.9406978Z BDFID: 768 2025-07-17T08:44:11.9407313Z Internal Node ID: 2 2025-07-17T08:44:11.9407662Z Compute Unit: 104 2025-07-17T08:44:11.9408002Z SIMDs per CU: 4 2025-07-17T08:44:11.9408342Z Shader Engines: 8 2025-07-17T08:44:11.9408699Z Shader Arrs. per Eng.: 1 2025-07-17T08:44:11.9409060Z WatchPts on Addr. Ranges:4 2025-07-17T08:44:11.9409427Z Coherent Host Access: FALSE 2025-07-17T08:44:11.9409752Z Memory Properties: 2025-07-17T08:44:11.9410010Z Features: KERNEL_DISPATCH 2025-07-17T08:44:11.9410339Z Fast F16 Operation: TRUE 2025-07-17T08:44:11.9410714Z Wavefront Size: 64(0x40) 2025-07-17T08:44:11.9411063Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:44:11.9411394Z Workgroup Max Size per Dimension: 2025-07-17T08:44:11.9411678Z x 1024(0x400) 2025-07-17T08:44:11.9411967Z y 1024(0x400) 2025-07-17T08:44:11.9412251Z z 1024(0x400) 2025-07-17T08:44:11.9412573Z Max Waves Per CU: 32(0x20) 2025-07-17T08:44:11.9412916Z Max Work-item Per CU: 2048(0x800) 2025-07-17T08:44:11.9413261Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:44:11.9413568Z Grid Max Size per Dimension: 2025-07-17T08:44:11.9413818Z x 4294967295(0xffffffff) 2025-07-17T08:44:11.9414110Z y 4294967295(0xffffffff) 2025-07-17T08:44:11.9414406Z z 4294967295(0xffffffff) 2025-07-17T08:44:11.9414731Z Max fbarriers/Workgrp: 32 2025-07-17T08:44:11.9415146Z Packet Processor uCode:: 83 2025-07-17T08:44:11.9415510Z SDMA engine uCode:: 8 2025-07-17T08:44:11.9415861Z IOMMU Support:: None 2025-07-17T08:44:11.9416162Z Pool Info: 2025-07-17T08:44:11.9416547Z Pool 1 2025-07-17T08:44:11.9416841Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:44:11.9417185Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:44:11.9417513Z Allocatable: TRUE 2025-07-17T08:44:11.9417861Z Alloc Granule: 4KB 2025-07-17T08:44:11.9418223Z Alloc Recommended Granule:2048KB 2025-07-17T08:44:11.9418603Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9418955Z Accessible by all: FALSE 2025-07-17T08:44:11.9419258Z Pool 2 2025-07-17T08:44:11.9419532Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-07-17T08:44:11.9419862Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:44:11.9420193Z Allocatable: TRUE 2025-07-17T08:44:11.9420533Z Alloc Granule: 4KB 2025-07-17T08:44:11.9420889Z Alloc Recommended Granule:2048KB 2025-07-17T08:44:11.9421248Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9421601Z Accessible by all: FALSE 2025-07-17T08:44:11.9421903Z Pool 3 2025-07-17T08:44:11.9422299Z Segment: GROUP 2025-07-17T08:44:11.9422610Z Size: 64(0x40) KB 2025-07-17T08:44:11.9422936Z Allocatable: FALSE 2025-07-17T08:44:11.9423280Z Alloc Granule: 0KB 2025-07-17T08:44:11.9423638Z Alloc Recommended Granule:0KB 2025-07-17T08:44:11.9424001Z Alloc Alignment: 0KB 2025-07-17T08:44:11.9424360Z Accessible by all: FALSE 2025-07-17T08:44:11.9424668Z ISA Info: 2025-07-17T08:44:11.9424888Z ISA 1 2025-07-17T08:44:11.9425174Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-07-17T08:44:11.9425554Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-07-17T08:44:11.9425926Z Profiles: HSA_PROFILE_BASE 2025-07-17T08:44:11.9426290Z Default Rounding Mode: NEAR 2025-07-17T08:44:11.9426658Z Default Rounding Mode: NEAR 2025-07-17T08:44:11.9426996Z Fast f16: TRUE 2025-07-17T08:44:11.9427337Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:44:11.9427662Z Workgroup Max Size per Dimension: 2025-07-17T08:44:11.9427956Z x 1024(0x400) 2025-07-17T08:44:11.9428287Z y 1024(0x400) 2025-07-17T08:44:11.9428653Z z 1024(0x400) 2025-07-17T08:44:11.9429351Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:44:11.9430072Z Grid Max Size per Dimension: 2025-07-17T08:44:11.9430673Z x 4294967295(0xffffffff) 2025-07-17T08:44:11.9431330Z y 4294967295(0xffffffff) 2025-07-17T08:44:11.9431883Z z 4294967295(0xffffffff) 2025-07-17T08:44:11.9432511Z FBarrier Max Size: 32 2025-07-17T08:44:11.9433097Z ******* 2025-07-17T08:44:11.9433515Z Agent 4 2025-07-17T08:44:11.9433925Z ******* 2025-07-17T08:44:11.9434644Z Name: gfx90a 2025-07-17T08:44:11.9435294Z Uuid: GPU-6b7c2bd8af06ff6a 2025-07-17T08:44:11.9435971Z Marketing Name: AMD Instinct MI210 2025-07-17T08:44:11.9436651Z Vendor Name: AMD 2025-07-17T08:44:11.9437302Z Feature: KERNEL_DISPATCH 2025-07-17T08:44:11.9437952Z Profile: BASE_PROFILE 2025-07-17T08:44:11.9438622Z Float Round Mode: NEAR 2025-07-17T08:44:11.9439299Z Max Queue Number: 128(0x80) 2025-07-17T08:44:11.9440053Z Queue Min Size: 64(0x40) 2025-07-17T08:44:11.9440700Z Queue Max Size: 131072(0x20000) 2025-07-17T08:44:11.9441337Z Queue Type: MULTI 2025-07-17T08:44:11.9441948Z Node: 3 2025-07-17T08:44:11.9442573Z Device Type: GPU 2025-07-17T08:44:11.9443148Z Cache Info: 2025-07-17T08:44:11.9443627Z L1: 16(0x10) KB 2025-07-17T08:44:11.9444193Z L2: 8192(0x2000) KB 2025-07-17T08:44:11.9444772Z Chip ID: 29711(0x740f) 2025-07-17T08:44:11.9445918Z ASIC Revision: 1(0x1) 2025-07-17T08:44:11.9446584Z Cacheline Size: 64(0x40) 2025-07-17T08:44:11.9447250Z Max Clock Freq. (MHz): 1700 2025-07-17T08:44:11.9447872Z BDFID: 33536 2025-07-17T08:44:11.9448612Z Internal Node ID: 3 2025-07-17T08:44:11.9449391Z Compute Unit: 104 2025-07-17T08:44:11.9450162Z SIMDs per CU: 4 2025-07-17T08:44:11.9450937Z Shader Engines: 8 2025-07-17T08:44:11.9451743Z Shader Arrs. per Eng.: 1 2025-07-17T08:44:11.9452565Z WatchPts on Addr. Ranges:4 2025-07-17T08:44:11.9453399Z Coherent Host Access: FALSE 2025-07-17T08:44:11.9454151Z Memory Properties: 2025-07-17T08:44:11.9454532Z Features: KERNEL_DISPATCH 2025-07-17T08:44:11.9454918Z Fast F16 Operation: TRUE 2025-07-17T08:44:11.9455290Z Wavefront Size: 64(0x40) 2025-07-17T08:44:11.9455649Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:44:11.9455983Z Workgroup Max Size per Dimension: 2025-07-17T08:44:11.9456281Z x 1024(0x400) 2025-07-17T08:44:11.9456577Z y 1024(0x400) 2025-07-17T08:44:11.9456858Z z 1024(0x400) 2025-07-17T08:44:11.9457172Z Max Waves Per CU: 32(0x20) 2025-07-17T08:44:11.9457523Z Max Work-item Per CU: 2048(0x800) 2025-07-17T08:44:11.9457871Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:44:11.9458184Z Grid Max Size per Dimension: 2025-07-17T08:44:11.9458456Z x 4294967295(0xffffffff) 2025-07-17T08:44:11.9458745Z y 4294967295(0xffffffff) 2025-07-17T08:44:11.9459028Z z 4294967295(0xffffffff) 2025-07-17T08:44:11.9459355Z Max fbarriers/Workgrp: 32 2025-07-17T08:44:11.9459730Z Packet Processor uCode:: 83 2025-07-17T08:44:11.9460251Z SDMA engine uCode:: 8 2025-07-17T08:44:11.9460621Z IOMMU Support:: None 2025-07-17T08:44:11.9460925Z Pool Info: 2025-07-17T08:44:11.9461159Z Pool 1 2025-07-17T08:44:11.9461451Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-07-17T08:44:11.9461792Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:44:11.9462135Z Allocatable: TRUE 2025-07-17T08:44:11.9462486Z Alloc Granule: 4KB 2025-07-17T08:44:11.9462851Z Alloc Recommended Granule:2048KB 2025-07-17T08:44:11.9463221Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9463575Z Accessible by all: FALSE 2025-07-17T08:44:11.9463883Z Pool 2 2025-07-17T08:44:11.9464171Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-07-17T08:44:11.9464503Z Size: 67092480(0x3ffc000) KB 2025-07-17T08:44:11.9464826Z Allocatable: TRUE 2025-07-17T08:44:11.9465170Z Alloc Granule: 4KB 2025-07-17T08:44:11.9465528Z Alloc Recommended Granule:2048KB 2025-07-17T08:44:11.9466029Z Alloc Alignment: 4KB 2025-07-17T08:44:11.9466388Z Accessible by all: FALSE 2025-07-17T08:44:11.9466698Z Pool 3 2025-07-17T08:44:11.9466961Z Segment: GROUP 2025-07-17T08:44:11.9467279Z Size: 64(0x40) KB 2025-07-17T08:44:11.9467598Z Allocatable: FALSE 2025-07-17T08:44:11.9467951Z Alloc Granule: 0KB 2025-07-17T08:44:11.9468328Z Alloc Recommended Granule:0KB 2025-07-17T08:44:11.9468691Z Alloc Alignment: 0KB 2025-07-17T08:44:11.9469052Z Accessible by all: FALSE 2025-07-17T08:44:11.9469361Z ISA Info: 2025-07-17T08:44:11.9469582Z ISA 1 2025-07-17T08:44:11.9469882Z Name: amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack- 2025-07-17T08:44:11.9470264Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-07-17T08:44:11.9470629Z Profiles: HSA_PROFILE_BASE 2025-07-17T08:44:11.9470986Z Default Rounding Mode: NEAR 2025-07-17T08:44:11.9471353Z Default Rounding Mode: NEAR 2025-07-17T08:44:11.9471697Z Fast f16: TRUE 2025-07-17T08:44:11.9472042Z Workgroup Max Size: 1024(0x400) 2025-07-17T08:44:11.9472371Z Workgroup Max Size per Dimension: 2025-07-17T08:44:11.9472661Z x 1024(0x400) 2025-07-17T08:44:11.9472950Z y 1024(0x400) 2025-07-17T08:44:11.9473228Z z 1024(0x400) 2025-07-17T08:44:11.9473540Z Grid Max Size: 4294967295(0xffffffff) 2025-07-17T08:44:11.9473849Z Grid Max Size per Dimension: 2025-07-17T08:44:11.9474115Z x 4294967295(0xffffffff) 2025-07-17T08:44:11.9474405Z y 4294967295(0xffffffff) 2025-07-17T08:44:11.9474691Z z 4294967295(0xffffffff) 2025-07-17T08:44:11.9475015Z FBarrier Max Size: 32 2025-07-17T08:44:11.9475477Z *** Done *** 2025-07-17T08:44:11.9475705Z + rocminfo 2025-07-17T08:44:11.9475911Z + grep -E 'Name:.*\sgfx|Marketing' 2025-07-17T08:44:12.0428533Z Marketing Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:44:12.0429471Z Marketing Name: AMD EPYC 7513 32-Core Processor 2025-07-17T08:44:12.0430204Z Name: gfx90a 2025-07-17T08:44:12.0430917Z Marketing Name: AMD Instinct MI210 2025-07-17T08:44:12.0431558Z Name: gfx90a 2025-07-17T08:44:12.0432187Z Marketing Name: AMD Instinct MI210 2025-07-17T08:44:12.0579667Z + MAYBE_ROCM=rocm/ 2025-07-17T08:44:12.0580271Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-07-17T08:44:12.0580926Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-07-17T08:44:12.0581540Z + pip_install ninja==1.10.2 2025-07-17T08:44:12.0582183Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-07-17T08:44:12.0582997Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-07-17T08:44:12.4390067Z Collecting ninja==1.10.2 2025-07-17T08:44:12.4891056Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-07-17T08:44:12.5060987Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-07-17T08:44:12.9902674Z Installing collected packages: ninja 2025-07-17T08:44:12.9904150Z Attempting uninstall: ninja 2025-07-17T08:44:12.9908890Z Found existing installation: ninja 1.11.1.3 2025-07-17T08:44:12.9929066Z Uninstalling ninja-1.11.1.3: 2025-07-17T08:44:13.0049934Z Successfully uninstalled ninja-1.11.1.3 2025-07-17T08:44:13.0357867Z Successfully installed ninja-1.10.2 2025-07-17T08:44:13.0990578Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-07-17T08:44:13.0993964Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-07-17T08:44:13.0995932Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-07-17T08:44:13.0996553Z + [[ linux-jammy-rocm-py3.10 == *asan* ]] 2025-07-17T08:44:13.0997158Z + [[ linux-jammy-rocm-py3.10 == *-debug* ]] 2025-07-17T08:44:13.0997740Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-07-17T08:44:13.0998611Z + echo 'We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass' 2025-07-17T08:44:13.1000144Z We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass 2025-07-17T08:44:13.1001049Z + cd test 2025-07-17T08:44:13.1001764Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-07-17T08:44:14.5561321Z + [[ inductor == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-07-17T08:44:14.5561919Z + [[ inductor == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-07-17T08:44:14.5562506Z + [[ inductor == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-07-17T08:44:14.5570286Z + DYNAMO_BENCHMARK_FLAGS=() 2025-07-17T08:44:14.5570991Z + [[ inductor == *pr_time_benchmarks* ]] 2025-07-17T08:44:14.5571594Z + [[ inductor == *dynamo_eager* ]] 2025-07-17T08:44:14.5572128Z + [[ inductor == *aot_eager* ]] 2025-07-17T08:44:14.5572690Z + [[ inductor == *aot_inductor* ]] 2025-07-17T08:44:14.5573217Z + [[ inductor == *max_autotune_inductor* ]] 2025-07-17T08:44:14.5573783Z + [[ inductor == *inductor* ]] 2025-07-17T08:44:14.5574278Z + [[ inductor != *perf* ]] 2025-07-17T08:44:14.5574822Z + DYNAMO_BENCHMARK_FLAGS+=(--inductor) 2025-07-17T08:44:14.5575381Z + [[ inductor == *dynamic* ]] 2025-07-17T08:44:14.5575857Z + [[ inductor == *cpu* ]] 2025-07-17T08:44:14.5576351Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-07-17T08:44:14.5615542Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-07-17T08:44:14.5616333Z + [[ linux-jammy-rocm-py3.10 == *-bazel-* ]] 2025-07-17T08:44:14.5621252Z + cd test 2025-07-17T08:44:14.5621841Z + python -c 'import torch; print(torch.__config__.show())' 2025-07-17T08:44:15.7133742Z PyTorch built with: 2025-07-17T08:44:15.7133982Z - GCC 11.4 2025-07-17T08:44:15.7134176Z - C++ Version: 201703 2025-07-17T08:44:15.7134606Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-07-17T08:44:15.7135192Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-07-17T08:44:15.7135515Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-07-17T08:44:15.7135773Z - LAPACK is enabled (usually provided by MKL) 2025-07-17T08:44:15.7136020Z - NNPACK is enabled 2025-07-17T08:44:15.7136224Z - CPU capability usage: AVX2 2025-07-17T08:44:15.7136440Z - HIP Runtime 6.4.43483 2025-07-17T08:44:15.7136632Z - MIOpen 3.4.0 2025-07-17T08:44:15.7136797Z - Magma 2.7.2 2025-07-17T08:44:15.7139931Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=a38f433be2e94a64b095a44ba39879d02d0c2316, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.9.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-07-17T08:44:15.7143550Z 2025-07-17T08:44:16.0022580Z + cd test 2025-07-17T08:44:16.0023234Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-07-17T08:44:17.0671269Z ATen/Parallel: 2025-07-17T08:44:17.0671957Z at::get_num_threads() : 64 2025-07-17T08:44:17.0672531Z at::get_num_interop_threads() : 64 2025-07-17T08:44:17.0673168Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-07-17T08:44:17.0673694Z omp_get_max_threads() : 64 2025-07-17T08:44:17.0674690Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-07-17T08:44:17.0675731Z mkl_get_max_threads() : 64 2025-07-17T08:44:17.0676411Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-07-17T08:44:17.0677191Z std::thread::hardware_concurrency() : 64 2025-07-17T08:44:17.0677740Z Environment variables: 2025-07-17T08:44:17.0678228Z OMP_NUM_THREADS : [not set] 2025-07-17T08:44:17.0678705Z MKL_NUM_THREADS : [not set] 2025-07-17T08:44:17.0679191Z ATen parallel backend: OpenMP 2025-07-17T08:44:17.0679835Z 2025-07-17T08:44:17.4477290Z + [[ inductor == *numpy_2* ]] 2025-07-17T08:44:17.4478005Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-07-17T08:44:17.4478644Z + [[ inductor == *backward* ]] 2025-07-17T08:44:17.4479150Z + [[ inductor == *xla* ]] 2025-07-17T08:44:17.4479744Z + [[ inductor == *executorch* ]] 2025-07-17T08:44:17.4480265Z + [[ inductor == \j\i\t\_\l\e\g\a\c\y ]] 2025-07-17T08:44:17.4480856Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-07-17T08:44:17.4481439Z + [[ inductor == distributed ]] 2025-07-17T08:44:17.4481954Z + [[ inductor == *operator_benchmark* ]] 2025-07-17T08:44:17.4482515Z + [[ inductor == *inductor_distributed* ]] 2025-07-17T08:44:17.4483086Z + [[ inductor == *inductor-halide* ]] 2025-07-17T08:44:17.4483662Z + [[ inductor == *inductor-triton-cpu* ]] 2025-07-17T08:44:17.4484891Z + [[ inductor == *inductor-micro-benchmark* ]] 2025-07-17T08:44:17.4485519Z + [[ inductor == *huggingface* ]] 2025-07-17T08:44:17.4486016Z + [[ inductor == *timm* ]] 2025-07-17T08:44:17.4486475Z + [[ inductor == cachebench ]] 2025-07-17T08:44:17.4486997Z + [[ inductor == verify_cachebench ]] 2025-07-17T08:44:17.4487532Z + [[ inductor == *torchbench* ]] 2025-07-17T08:44:17.4488052Z + [[ inductor == *inductor_cpp_wrapper* ]] 2025-07-17T08:44:17.4488613Z + [[ inductor == *inductor* ]] 2025-07-17T08:44:17.4489031Z + install_torchvision 2025-07-17T08:44:17.4489269Z + local orig_preload 2025-07-17T08:44:17.4489482Z + local commit 2025-07-17T08:44:17.4489711Z ++ get_pinned_commit vision 2025-07-17T08:44:17.4489982Z ++ cat .github/ci_commit_pins/vision.txt 2025-07-17T08:44:17.4506240Z + commit=966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:17.4506977Z + orig_preload= 2025-07-17T08:44:17.4507407Z + '[' -n '' ']' 2025-07-17T08:44:17.4508420Z + pip_install --no-use-pep517 git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:17.4509629Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-07-17T08:44:17.4511057Z + python3 -m pip install --progress-bar off --no-use-pep517 git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:17.7483658Z Collecting git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:17.7487714Z Cloning https://github.com/pytorch/vision.git (to revision 966da7e46f65d6d49df3e31214470a4fe5cc8e66) to /tmp/pip-req-build-tcqc6w15 2025-07-17T08:44:17.7516565Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-tcqc6w15 2025-07-17T08:44:22.4173026Z Running command git rev-parse -q --verify 'sha^966da7e46f65d6d49df3e31214470a4fe5cc8e66' 2025-07-17T08:44:22.4202410Z Running command git fetch -q https://github.com/pytorch/vision.git 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:22.8767617Z Running command git checkout -q 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:23.5835476Z Resolved https://github.com/pytorch/vision.git to commit 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-07-17T08:44:25.9257497Z Preparing metadata (setup.py) ... [?25l- \ | / done 2025-07-17T08:44:25.9315954Z [?25hRequirement already satisfied: numpy in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.22.0a0+966da7e) (1.22.4) 2025-07-17T08:44:25.9328269Z Requirement already satisfied: torch in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.22.0a0+966da7e) (2.9.0a0+gita38f433) 2025-07-17T08:44:25.9334442Z Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.22.0a0+966da7e) (11.0.0) 2025-07-17T08:44:25.9444945Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.22.0a0+966da7e) (3.18.0) 2025-07-17T08:44:25.9455312Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.22.0a0+966da7e) (4.14.1) 2025-07-17T08:44:25.9464234Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.22.0a0+966da7e) (1.13.3) 2025-07-17T08:44:25.9467521Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.22.0a0+966da7e) (2.8.8) 2025-07-17T08:44:25.9474920Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.22.0a0+966da7e) (3.1.6) 2025-07-17T08:44:25.9481510Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.22.0a0+966da7e) (2025.5.1) 2025-07-17T08:44:25.9508289Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch->torchvision==0.22.0a0+966da7e) (1.3.0) 2025-07-17T08:44:26.0028538Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch->torchvision==0.22.0a0+966da7e) (3.0.2) 2025-07-17T08:44:26.0068858Z Building wheels for collected packages: torchvision 2025-07-17T08:44:26.0153994Z  DEPRECATION: Building 'torchvision' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torchvision'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-07-17T08:45:16.8414098Z  Building wheel for torchvision (setup.py) ... [?25l- \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / done 2025-07-17T08:45:16.8453234Z [?25h Created wheel for torchvision: filename=torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl size=1570340 sha256=22616d36bb12ab031ab24db13a6aa9092e37d81c84b9a6b2ab1118d42255b234 2025-07-17T08:45:16.8456696Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/9c/9d/3e/42fa2d5ac6ba44a90363f8fff0fa9e712e24d4f977637c81cb 2025-07-17T08:45:16.8509253Z Successfully built torchvision 2025-07-17T08:45:17.2911095Z Installing collected packages: torchvision 2025-07-17T08:45:17.6446797Z Successfully installed torchvision-0.22.0a0+966da7e 2025-07-17T08:45:17.7663271Z + '[' -n '' ']' 2025-07-17T08:45:17.7663793Z + test_inductor_shard 2 2025-07-17T08:45:17.7664251Z + [[ -z 2 ]] 2025-07-17T08:45:17.7664693Z + python tools/dynamo/verify_dynamo.py 2025-07-17T08:45:19.3818197Z Python version: 3.10.18 2025-07-17T08:45:19.3818533Z `torch` version: 2.9.0a0+gita38f433 2025-07-17T08:45:19.3818814Z CUDA version: None 2025-07-17T08:45:19.3819068Z ROCM version: 6.4 2025-07-17T08:45:19.3819197Z 2025-07-17T08:45:29.8230892Z All required checks passed 2025-07-17T08:45:31.3982261Z + python test/run_test.py --inductor --include test_modules test_ops test_ops_gradients test_torch --shard 2 2 --verbose 2025-07-17T08:45:33.9112266Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-07-17T08:45:33.9114816Z import pkg_resources 2025-07-17T08:45:34.3264897Z No TD results found 2025-07-17T08:45:34.3265630Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-07-17T08:45:34.7460967Z Ignoring disabled issues: [''] 2025-07-17T08:45:34.7934259Z Found test times from artifacts 2025-07-17T08:45:34.8821279Z Found test times from artifacts 2025-07-17T08:45:34.8853922Z Running all tests 2025-07-17T08:45:34.8861530Z Running parallel tests on 2 processes 2025-07-17T08:45:34.8862239Z Name: tests to run (est. time: 94.48min) 2025-07-17T08:45:34.8862777Z Serial tests (0): 2025-07-17T08:45:34.8863223Z Parallel tests (19): 2025-07-17T08:45:34.8863684Z test_modules 1/8 2025-07-17T08:45:34.8864126Z test_modules 2/8 2025-07-17T08:45:34.8864579Z test_modules 3/8 2025-07-17T08:45:34.8864964Z test_modules 4/8 2025-07-17T08:45:34.8865345Z test_modules 5/8 2025-07-17T08:45:34.8865721Z test_modules 6/8 2025-07-17T08:45:34.8866096Z test_modules 7/8 2025-07-17T08:45:34.8866488Z test_ops 1/20 2025-07-17T08:45:34.8866889Z test_ops 2/20 2025-07-17T08:45:34.8867268Z test_ops 5/20 2025-07-17T08:45:34.8867646Z test_ops 6/20 2025-07-17T08:45:34.8868027Z test_ops 9/20 2025-07-17T08:45:34.8868405Z test_ops 10/20 2025-07-17T08:45:34.8868801Z test_ops 13/20 2025-07-17T08:45:34.8869952Z test_ops 14/20 2025-07-17T08:45:34.8870364Z test_ops 17/20 2025-07-17T08:45:34.8870732Z test_ops 18/20 2025-07-17T08:45:34.8871136Z test_ops_gradients 1/3 2025-07-17T08:45:34.8871601Z test_ops_gradients 2/3 2025-07-17T08:45:34.8872079Z Name: excluded (est. time: 0.0min) 2025-07-17T08:45:34.8872579Z Serial tests (0): 2025-07-17T08:45:34.8872993Z Parallel tests (0): 2025-07-17T08:45:34.8873549Z Running test_modules 1/8 ... [2025-07-17 08:45:34.886184] 2025-07-17T08:45:34.8874229Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:45:34.8876003Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=1', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:45:34.886501] 2025-07-17T08:45:40.0128164Z 2025-07-17T08:45:40.0129675Z test_modules 1/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_1.8_70f8508a45caa97f_.log 2025-07-17T08:45:40.0130580Z Running 0 items in this shard: 2025-07-17T08:45:40.0130784Z 2025-07-17T08:45:40.0131063Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T08:45:40.0131519Z Uploading artifacts took 0.00 seconds 2025-07-17T08:45:40.0136447Z Running test_modules 2/8 ... [2025-07-17 08:45:40.012645] 2025-07-17T08:45:40.0136855Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:45:40.0138371Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=2', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:45:40.013077] 2025-07-17T08:45:45.1893199Z 2025-07-17T08:45:45.1894050Z test_modules 2/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_2.8_da0da44ee4b20984_.log 2025-07-17T08:45:45.1894724Z Running 0 items in this shard: 2025-07-17T08:45:45.1894909Z 2025-07-17T08:45:45.1895087Z Running test_modules 3/8 ... [2025-07-17 08:45:45.189266] 2025-07-17T08:45:45.1895453Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:45:45.1899405Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=3', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:45:45.189621] 2025-07-17T08:45:50.3158366Z 2025-07-17T08:45:50.3160347Z test_modules 3/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_3.8_0f354499f2dfc7e1_.log 2025-07-17T08:45:50.3161828Z Running 0 items in this shard: 2025-07-17T08:45:50.3162228Z 2025-07-17T08:45:50.3162588Z Running test_modules 4/8 ... [2025-07-17 08:45:50.315643] 2025-07-17T08:45:50.3163380Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:45:50.3172914Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=4', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:45:50.316258] 2025-07-17T08:45:55.4931324Z 2025-07-17T08:45:55.4933380Z test_modules 4/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_4.8_817eb84a93fd3dc0_.log 2025-07-17T08:45:55.4934987Z Running 0 items in this shard: 2025-07-17T08:45:55.4935404Z 2025-07-17T08:45:55.4935742Z Running test_modules 5/8 ... [2025-07-17 08:45:55.492969] 2025-07-17T08:45:55.4936237Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:45:55.4939341Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=5', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:45:55.493589] 2025-07-17T08:46:00.5701076Z 2025-07-17T08:46:00.5703440Z test_modules 5/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_5.8_8b65bf806f85c155_.log 2025-07-17T08:46:00.5704153Z Running 0 items in this shard: 2025-07-17T08:46:00.5704326Z 2025-07-17T08:46:00.5704494Z Running test_modules 6/8 ... [2025-07-17 08:46:00.569916] 2025-07-17T08:46:00.5704850Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:46:00.5707881Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=6', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:46:00.570488] 2025-07-17T08:46:05.7978343Z 2025-07-17T08:46:05.7979728Z test_modules 6/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_6.8_0d00e3455b619fa9_.log 2025-07-17T08:46:05.7980789Z Running 0 items in this shard: 2025-07-17T08:46:05.7981080Z 2025-07-17T08:46:05.7981378Z Running test_modules 7/8 ... [2025-07-17 08:46:05.797663] 2025-07-17T08:46:05.7981953Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:46:05.7988632Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'serial', '--shard-id=7', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:46:05.798273] 2025-07-17T08:46:10.9756816Z 2025-07-17T08:46:10.9758362Z test_modules 7/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_7.8_d8d9f80e99cf0ce1_.log 2025-07-17T08:46:10.9759716Z Running 0 items in this shard: 2025-07-17T08:46:10.9760096Z 2025-07-17T08:46:10.9760419Z Running test_ops 1/20 ... [2025-07-17 08:46:10.975473] 2025-07-17T08:46:10.9761094Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:46:10.9766862Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:46:10.976053] 2025-07-17T08:46:24.8723633Z 2025-07-17T08:46:24.8724850Z test_ops 1/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_1.20_f4cb32285ac5c472_.log 2025-07-17T08:46:24.8725784Z Running 0 items in this shard: 2025-07-17T08:46:24.8726094Z 2025-07-17T08:46:24.8731527Z Running test_ops 2/20 ... [2025-07-17 08:46:24.872042] 2025-07-17T08:46:24.8732207Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:46:24.8733905Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=2', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:46:24.872637] 2025-07-17T08:46:38.9194094Z 2025-07-17T08:46:38.9195155Z test_ops 2/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_2.20_c479df788ca6cd6e_.log 2025-07-17T08:46:38.9196129Z Running 0 items in this shard: 2025-07-17T08:46:38.9196427Z 2025-07-17T08:46:38.9196666Z Running test_ops 5/20 ... [2025-07-17 08:46:38.919286] 2025-07-17T08:46:38.9197363Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:46:38.9207886Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=5', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:46:38.919827] 2025-07-17T08:46:52.6167479Z 2025-07-17T08:46:52.6168709Z test_ops 5/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_5.20_5558d3bfa2afe709_.log 2025-07-17T08:46:52.6169720Z Running 0 items in this shard: 2025-07-17T08:46:52.6169966Z 2025-07-17T08:46:52.6170079Z Running test_ops 6/20 ... [2025-07-17 08:46:52.616305] 2025-07-17T08:46:52.6170765Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:46:52.6171881Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=6', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:46:52.616860] 2025-07-17T08:47:06.4133066Z 2025-07-17T08:47:06.4134832Z test_ops 6/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_6.20_1a23db2be4eaaeae_.log 2025-07-17T08:47:06.4136299Z Running 0 items in this shard: 2025-07-17T08:47:06.4136711Z 2025-07-17T08:47:06.4137056Z Running test_ops 9/20 ... [2025-07-17 08:47:06.413104] 2025-07-17T08:47:06.4137859Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:47:06.4140979Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=9', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:47:06.413682] 2025-07-17T08:47:20.1596103Z 2025-07-17T08:47:20.1597347Z test_ops 9/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_9.20_36564552e58b6fec_.log 2025-07-17T08:47:20.1598504Z Running 0 items in this shard: 2025-07-17T08:47:20.1598852Z 2025-07-17T08:47:20.1602857Z Running test_ops 10/20 ... [2025-07-17 08:47:20.159403] 2025-07-17T08:47:20.1604474Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:47:20.1607439Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=10', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:47:20.160020] 2025-07-17T08:47:34.0568248Z 2025-07-17T08:47:34.0569536Z test_ops 10/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_10.20_06168ccd078b0b87_.log 2025-07-17T08:47:34.0570577Z Running 0 items in this shard: 2025-07-17T08:47:34.0570860Z 2025-07-17T08:47:34.0571096Z Running test_ops 13/20 ... [2025-07-17 08:47:34.056522] 2025-07-17T08:47:34.0571649Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:47:34.0573876Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=13', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:47:34.056983] 2025-07-17T08:47:47.8021880Z 2025-07-17T08:47:47.8026954Z test_ops 13/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_13.20_a28dc1b338a11450_.log 2025-07-17T08:47:47.8028137Z Running 0 items in this shard: 2025-07-17T08:47:47.8028496Z 2025-07-17T08:47:47.8032985Z Running test_ops 14/20 ... [2025-07-17 08:47:47.801785] 2025-07-17T08:47:47.8033908Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:47:47.8035357Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=14', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:47:47.802110] 2025-07-17T08:48:01.5967065Z 2025-07-17T08:48:01.5968347Z test_ops 14/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_14.20_9bf3d32347aa0f90_.log 2025-07-17T08:48:01.5969418Z Running 0 items in this shard: 2025-07-17T08:48:01.5969733Z 2025-07-17T08:48:01.5969965Z Running test_ops 17/20 ... [2025-07-17 08:48:01.596501] 2025-07-17T08:48:01.5970590Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:48:01.5976899Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=17', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:48:01.597060] 2025-07-17T08:48:15.4933707Z 2025-07-17T08:48:15.4934944Z test_ops 17/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_17.20_215c6afd161c3dd1_.log 2025-07-17T08:48:15.4936175Z Running 0 items in this shard: 2025-07-17T08:48:15.4936524Z 2025-07-17T08:48:15.4936784Z Running test_ops 18/20 ... [2025-07-17 08:48:15.493038] 2025-07-17T08:48:15.4937440Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:48:15.4940359Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=18', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:48:15.493636] 2025-07-17T08:48:29.3396169Z 2025-07-17T08:48:29.3397472Z test_ops 18/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_18.20_358cfd3b0c49e03f_.log 2025-07-17T08:48:29.3398647Z Running 0 items in this shard: 2025-07-17T08:48:29.3399099Z 2025-07-17T08:48:29.3400287Z Running test_ops_gradients 1/3 ... [2025-07-17 08:48:29.339363] 2025-07-17T08:48:29.3401057Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:48:29.3405805Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'serial', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:48:29.339932] 2025-07-17T08:48:34.8671612Z 2025-07-17T08:48:34.8672950Z test_ops_gradients 1/3 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_1.3_6d24a48c14313266_.log 2025-07-17T08:48:34.8674109Z Running 0 items in this shard: 2025-07-17T08:48:34.8674401Z 2025-07-17T08:48:34.8674687Z Running test_ops_gradients 2/3 ... [2025-07-17 08:48:34.866870] 2025-07-17T08:48:34.8675282Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:48:34.8680052Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'serial', '--shard-id=2', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:48:34.867349] 2025-07-17T08:48:40.3952213Z 2025-07-17T08:48:40.3953811Z test_ops_gradients 2/3 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_2.3_bc2c60e882daf913_.log 2025-07-17T08:48:40.3955346Z Running 0 items in this shard: 2025-07-17T08:48:40.3955776Z 2025-07-17T08:48:43.0107566Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-07-17T08:48:43.0110073Z import pkg_resources 2025-07-17T08:48:43.0807553Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-07-17T08:48:43.0810037Z import pkg_resources 2025-07-17T08:48:43.2167115Z Running test_modules 1/8 ... [2025-07-17 08:48:43.216179] 2025-07-17T08:48:43.2167999Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:48:43.2170375Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=1', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:48:43.216492] 2025-07-17T08:48:43.3094578Z Running test_modules 2/8 ... [2025-07-17 08:48:43.308956] 2025-07-17T08:48:43.3094972Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:48:43.3097509Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=2', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:48:43.309342] 2025-07-17T08:56:39.1073042Z 2025-07-17T08:56:39.1073773Z test_modules 1/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_1.8_46f6e9a5a9947cc3_.log 2025-07-17T08:56:39.1187280Z Running 433 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm3d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRUCell_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardshrink_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTMCell_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Linear_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSigmoid_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MSELoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RMSNorm_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNN_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SiLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SoftMarginLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmax2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softsign_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCELoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LocalResponseNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_NLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RMSNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_True_set_grad_False_cuda_float32 2025-07-17T08:56:39.1299191Z 2025-07-17T08:56:39.1299325Z Running test_modules 3/8 ... [2025-07-17 08:56:39.108790] 2025-07-17T08:56:39.1299618Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:56:39.1300360Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=3', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:56:39.109403] 2025-07-17T08:57:10.2556628Z 2025-07-17T08:57:10.2557579Z test_modules 2/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_2.8_a20b81331821a929_.log 2025-07-17T08:57:10.2680887Z Running 462 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GRU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RNN_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CTCLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CrossEntropyLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRUCell_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GroupNorm_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardswish_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LayerNorm_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LocalResponseNorm_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_PoissonNLLLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNN_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SoftMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Threshold_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerDecoderLayer_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_KLDivLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_KLDivLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Linear_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSigmoid_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_NLLLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PoissonNLLLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PoissonNLLLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RMSNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU6_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SmoothL1Loss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SmoothL1Loss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softplus_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softplus_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softsign_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanh_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Threshold_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_True_set_grad_False_cuda_float32 2025-07-17T08:57:10.2792672Z 2025-07-17T08:57:10.2792807Z Running test_modules 4/8 ... [2025-07-17 08:57:10.256518] 2025-07-17T08:57:10.2793112Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T08:57:10.2793892Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=4', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 08:57:10.256856] 2025-07-17T09:02:11.3180065Z 2025-07-17T09:02:11.3180824Z test_modules 3/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_3.8_cfd3691ad8d99de7_.log 2025-07-17T09:02:11.3286451Z Running 406 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_forward_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm3d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CTCLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GLU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GroupNorm_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_HingeEmbeddingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LeakyReLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiLabelMarginLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_NLLLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmax_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softplus_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CrossEntropyLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CrossEntropyLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HingeEmbeddingLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HingeEmbeddingLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LayerNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU6_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SmoothL1Loss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_True_set_grad_True_cuda_float32 2025-07-17T09:02:11.3392847Z 2025-07-17T09:02:11.3392976Z Running test_modules 5/8 ... [2025-07-17 09:02:11.319212] 2025-07-17T09:02:11.3393253Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:02:11.3393992Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=5', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:02:11.319835] 2025-07-17T09:02:16.8255789Z 2025-07-17T09:02:16.8256553Z test_modules 4/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_4.8_ff270159cd2c017a_.log 2025-07-17T09:02:16.8373487Z Running 479 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTM_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RNN_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerEncoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_repr_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_save_load_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BCEWithLogitsLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Embedding_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRU_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardtanh_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_HuberLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm2d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm2d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_L1Loss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTMCell_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Mish_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiLabelSoftMarginLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReLU6_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReLU6_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmax2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmin_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanhshrink_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerDecoderLayer_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Transformer_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LayerNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LeakyReLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LocalResponseNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SiLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SiLU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softshrink_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanh_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Threshold_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad2d_swap_False_set_grad_True_cuda_float32 2025-07-17T09:02:16.8495010Z 2025-07-17T09:02:16.8495144Z Running test_modules 6/8 ... [2025-07-17 09:02:16.826163] 2025-07-17T09:02:16.8495431Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:02:16.8496169Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=6', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:02:16.826539] 2025-07-17T09:07:45.3740188Z 2025-07-17T09:07:45.3744092Z test_modules 5/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_5.8_7e098a67580781fc_.log 2025-07-17T09:07:45.3886546Z Running 475 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTM_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerEncoder_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardshrink_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardtanh_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm2d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_KLDivLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LocalResponseNorm_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSoftmax_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSoftmax_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MarginRankingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiLabelMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Sigmoid_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Sigmoid_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SmoothL1Loss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SmoothL1Loss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softsign_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanh_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanhshrink_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Threshold_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Linear_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MarginRankingLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softplus_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad3d_swap_False_set_grad_True_cuda_float32 2025-07-17T09:07:45.4027906Z 2025-07-17T09:07:45.4028077Z Running test_modules 7/8 ... [2025-07-17 09:07:45.374162] 2025-07-17T09:07:45.4028434Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:07:45.4028862Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T09:07:45.4029301Z Uploading artifacts took 0.00 seconds 2025-07-17T09:07:45.4030230Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '-m', 'not serial', '--shard-id=7', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:07:45.374448] 2025-07-17T09:08:14.7212481Z 2025-07-17T09:08:14.7213189Z test_modules 6/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_6.8_dceeec05c9c5ee11_.log 2025-07-17T09:08:14.7321543Z Running 441 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_forward_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GRU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiheadAttention_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerEncoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerEncoder_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BCELoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BCELoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRU_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_HingeEmbeddingLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm2d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_L1Loss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LayerNorm_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Linear_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSigmoid_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNNCell_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReLU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SiLU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmin_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softplus_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softshrink_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanh_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCELoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CrossEntropyLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_KLDivLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTMCell_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LeakyReLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LeakyReLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Linear_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSigmoid_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MarginRankingLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MarginRankingLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_NLLLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SiLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softplus_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softshrink_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softshrink_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softsign_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Threshold_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad3d_swap_True_set_grad_True_cuda_float32 2025-07-17T09:08:14.7426614Z 2025-07-17T09:08:14.7426733Z Running test_ops 1/20 ... [2025-07-17 09:08:14.721746] 2025-07-17T09:08:14.7427008Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:08:14.7427755Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:08:14.722046] 2025-07-17T09:11:50.2691975Z 2025-07-17T09:11:50.2692969Z test_modules 7/8 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_7.8_bbe0db4817f1ae24_.log 2025-07-17T09:11:50.2810537Z Running 482 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiheadAttention_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Bilinear_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CosineEmbeddingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CrossEntropyLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRU_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardswish_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LeakyReLU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MSELoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Mish_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_NLLLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNN_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softshrink_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HingeEmbeddingLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_KLDivLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTMCell_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LeakyReLU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LocalResponseNorm_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSigmoid_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSigmoid_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PoissonNLLLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PoissonNLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RMSNorm_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SmoothL1Loss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softsign_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softsign_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanh_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Threshold_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_False_set_grad_False_cuda_float32 2025-07-17T09:11:50.2925782Z 2025-07-17T09:11:50.2925909Z Running test_ops 2/20 ... [2025-07-17 09:11:50.269466] 2025-07-17T09:11:50.2926190Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:11:50.2926932Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=2', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:11:50.269798] 2025-07-17T09:20:34.6016294Z 2025-07-17T09:20:34.6017507Z test_ops 1/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_1.20_2ea386b77d2c74e6_.log 2025-07-17T09:20:34.6781947Z Running 1710 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing___getitem___cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nanmean_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___radd___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_double_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_equal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_frac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eig_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_binary_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bicubic_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_silu_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_inf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_quantile_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scalar_tensor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_general_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_square_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zero__cuda, test/test_ops.py::TestCommonCUDA::test_errors_dot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_h_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_consecutive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_argwhere_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_combinations_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_gather_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_max_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_reduction_no_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_reduction_with_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_outer_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_put_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_y1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmod___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matrix_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_outer_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pca_lowrank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_4inputs_with_extra_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_stft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_geqrf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_pool2d_with_indices_backward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_native_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool2d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resize_as__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_slice_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_y0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_hermite_polynomial_h_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_polar_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___getitem___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ormqr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_list_args_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_item_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_or_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_t_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_true_divide_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_kron_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lerp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eig_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_logsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_pinverse_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resolve_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tensor_split_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_short_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_contiguous_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logsumexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_full_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sinc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tanh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_trace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_angle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_einsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_kron_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_ldl_factor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_ldl_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_randn_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_list_args_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_sparse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_any_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_exp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_flip_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_round_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unfold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_decomposed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_arange_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_broadcast_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_flipud_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_hsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_inv_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logdet_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mT_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_full_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv_transpose1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_grid_sample_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_linear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_circular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_quantile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resize__cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rsub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_bartlett_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_slice_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_laguerre_polynomial_l_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_multiple_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapezoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zero__cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_heaviside_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cauchy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_j1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randint_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_complex32, test/test_ops.py::TestTagsCUDA::test_tags___rsub___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_short_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_chunk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_max_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_conj_physical_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_istft_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags__refs_log_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logaddexp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_round_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_select_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_xlogy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argsort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bfloat16_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chunk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_eq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_erfc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_full_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ldexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lgamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_std_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_multinomial_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_narrow_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ne_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nextafter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_celu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_linear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pinverse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_y1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_log_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trunc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_zero__cuda_float32 2025-07-17T09:20:34.7158741Z 2025-07-17T09:20:34.7158864Z Running test_ops 5/20 ... [2025-07-17 09:20:34.620887] 2025-07-17T09:20:34.7159150Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:20:34.7159942Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=5', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:20:34.621226] 2025-07-17T09:22:55.2700217Z 2025-07-17T09:22:55.2701465Z test_ops 2/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_2.20_e218ab13ee73fd8e_.log 2025-07-17T09:22:55.3133520Z Running 1647 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addbmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_imag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cond_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmedian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nansum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_logsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multi_head_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_constant_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_airy_ai_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_scaled_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch__scaled_mm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapz_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unique_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_errors_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ge_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_errors_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_interleave_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___radd___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_warning_H_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cov_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_double_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_eq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_inf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_slice_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_square_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rpow___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_fro_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapz_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_mean_unbiased_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flip_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tril_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flip_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ldexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_pow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rpow___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_half_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_conj_physical_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isfinite_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diag_embed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_renorm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsqueeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___radd___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rsub___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_all_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_flatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_permute_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_select_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cauchy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cummax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expm1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_histc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_householder_product_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nan_to_num_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanquantile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nansum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_zeros_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardswish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tensor_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapz_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_mean_unbiased_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float16, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_int_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_block_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exponential_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_permute_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_roll_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_entr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_triu_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_unflatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atan2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_cartesian_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagflat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_floor_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lu_unpack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_median_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_movedim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_dropout_backward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nonzero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_real_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsqueeze_copy_cuda_float32 2025-07-17T09:22:55.3502506Z 2025-07-17T09:22:55.3502631Z Running test_ops 6/20 ... [2025-07-17 09:22:55.284307] 2025-07-17T09:22:55.3502920Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:22:55.3503805Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=6', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:22:55.284627] 2025-07-17T09:29:38.6518130Z 2025-07-17T09:29:38.6522293Z test_ops 5/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_5.20_e88a9f0674a1b7f3_.log 2025-07-17T09:29:38.7099627Z Running 1669 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing__chunk_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mH_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_repeat_interleave_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___rmod___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cartesian_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_combinations_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_rank_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vander_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_msort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_binary_cross_entropy_with_logits_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool3d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_take_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_sparse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tril_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_scatter_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_histc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_4inputs_with_extra_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___getitem___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rand___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rxor___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_kron_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pad_constant_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_permute_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_msort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_exponential_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tile_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_bmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_double_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_2inputs_2outputs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_det_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_long_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_embedding_bag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_linear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_logsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_one_hot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_scaled_dot_product_attention_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_nuc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_outer_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_roll_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_blackman_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_any_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_vsplit_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_short_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_conj_physical_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_char_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diff_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_double_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_matmul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ne_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_t_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__chunk_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_flipud_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_ravel_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_reciprocal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addcmul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_irfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_factor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_and_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mH_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pinverse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rand_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_with_sizes_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_with_sizes_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_stft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_bucketize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hypot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_minimum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_tanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__segment_reduce_lengths_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_angle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_copysign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_eye_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_unary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_dropout_backward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_constant_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_relu6_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_pca_lowrank_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reshape_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_gaussian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_as_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cauchy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_heaviside_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zeros_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__chunk_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_right_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bool, test/test_ops.py::TestTagsCUDA::test_tags___rmod___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_and_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_heaviside_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_igammac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_positive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reciprocal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_transpose_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmm_decomposed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bucketize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_item_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_factor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logcumsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_full_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ones_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_permute_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resolve_neg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unbind_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_vsplit_cuda_float32 2025-07-17T09:29:38.7481017Z 2025-07-17T09:29:38.7481153Z Running test_ops 9/20 ... [2025-07-17 09:29:38.668589] 2025-07-17T09:29:38.7481415Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:29:38.7481741Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T09:29:38.7482070Z Uploading artifacts took 0.00 seconds 2025-07-17T09:29:38.7482779Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=9', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:29:38.668963] 2025-07-17T09:33:00.8582295Z 2025-07-17T09:33:00.8583727Z test_ops 6/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_6.20_e14ab6e37392398a_.log 2025-07-17T09:33:00.9048169Z Running 1721 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nonzero_static_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ones_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rand_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsafe_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_H_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_abs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_flip_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_histc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mT_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool1d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multilabel_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_silu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_nuc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_y1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_polygamma_special_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_stft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_gaussian_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_errors_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___radd___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cummin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_msort_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nonzero_static_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kthvalue_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zero__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning___rmod___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__chunk_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_imag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__unsafe_masked_index_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bool_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eig_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_inv_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_unpack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_min_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_msort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_binary_cross_entropy_with_logits_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_replicate_negative_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pca_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_to_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_complex_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_consecutive_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmatmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_hstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_permute_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sigmoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_argwhere_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_corrcoef_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_gather_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_svdvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_narrow_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_interleave_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsqueeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_asin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_and_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_xor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cartesian_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_char_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_or_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_movedim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_circular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softsign_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_positive_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_repeat_interleave_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sgn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmatmul___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__batch_norm_with_update_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_int_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_allclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atan2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lerp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rsub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_geqrf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_argmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_argmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_matrix_exp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_maximum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_gelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_kl_div_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_upsample_nearest_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resize_as__cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_take_along_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_copy_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_in_place_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_permuted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float16, test/test_ops.py::TestTagsCUDA::test_tags__refs_asinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diag_embed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_frac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isnan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtri_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_acosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addbmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_asin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_double_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_inner_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_le_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_matrix_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_selu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pca_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randint_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_reciprocal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resize__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_cosine_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_j1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trapz_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tril_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unbind_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_cuda_float32 2025-07-17T09:33:00.9431797Z 2025-07-17T09:33:00.9431920Z Running test_ops 10/20 ... [2025-07-17 09:33:00.880522] 2025-07-17T09:33:00.9432212Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:33:00.9432942Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=10', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:33:00.881084] 2025-07-17T09:37:43.9706014Z 2025-07-17T09:37:43.9711160Z test_ops 9/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_9.20_f4375b7c7ec4faec_.log 2025-07-17T09:37:44.0482907Z Running 1657 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_arange_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_roll_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_full_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_binary_return_by_ref_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_matrix_exp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_leaky_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_reflect_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_pca_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resolve_conj_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_short_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_errors_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_errors_median_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__chunk_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__unsafe_masked_index_put_accumulate_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nansum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ne_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_cosine_embedding_loss_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_neg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_hermite_polynomial_he_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_einsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_addbmm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tril_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_upsample_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_decimals_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_where_cuda, test/test_ops.py::TestCommonCUDA::test_out_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igamma_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pdist_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_permuted_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mH_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softsign_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_xor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randint_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_movedim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rsqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_transpose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_bmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_physical_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cov_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_istft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log1p_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_randn_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sigmoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unbind_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_all_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_eye_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_float_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isnan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_or_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sigmoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_squeeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmm_decomposed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_byte_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isclose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_long_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ones_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_put_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scalar_tensor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sinc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sparse_sampled_addmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_contiguous_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diag_embed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_eye_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_remainder_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_signbit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_true_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_acos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addcmul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_column_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_combinations_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_exp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kthvalue_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_factor_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_factor_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_pool2d_with_indices_backward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_narrow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_elu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_logsigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_number_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vsplit_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_right_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_floor_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resize__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_complex128, test/test_ops.py::TestTagsCUDA::test_tags_T_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___ror___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_all_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_any_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_real_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rsqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bincount_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_block_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_corrcoef_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cos_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diff_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_float_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gather_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ge_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_grid_sampler_2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mT_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanmean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanquantile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_normalize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_silu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_fro_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_nuc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polar_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rand_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hamming_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_erfcx_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_t_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsafe_chunk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsafe_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_vstack_cuda_float32 2025-07-17T09:37:44.0847005Z 2025-07-17T09:37:44.0847129Z Running test_ops 13/20 ... [2025-07-17 09:37:44.005973] 2025-07-17T09:37:44.0847401Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:37:44.0848123Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=13', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:37:44.006485] 2025-07-17T09:42:08.0472907Z 2025-07-17T09:42:08.0474205Z test_ops 10/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_10.20_f26c64046a85ef2a_.log 2025-07-17T09:42:08.1153309Z Running 1615 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_H_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nansum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_eq_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log1p_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_square_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diff_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_householder_product_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_multi_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardswish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_linear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polar_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_hermite_polynomial_h_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_errors_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_neg_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_errors_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_take_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize_as__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_slice_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_out__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagflat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_householder_product_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_matmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resolve_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_decimals_0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igammac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_split_with_sizes_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_linear_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_where_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eye_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flipud_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_angle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_clone_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_linear_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_list_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zeros_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_double_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_equal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_permute_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_squeeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_all_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chalf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_inverse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagflat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diff_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isnan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log10_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_fro_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_short_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_mean_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapezoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapz_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_where_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_short_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erfinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_floor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isnan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rot90_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_entr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__softmax_backward_data_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_aminmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bucketize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gradient_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kron_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cholesky_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_det_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_triangular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_tensorsolve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmedian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_glu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_silu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_inf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_put_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reshape_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_allclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_byte_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_and_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__batch_norm_with_update_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_or_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_cos_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eye_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isneginf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sgn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unbind_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_allclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_aminmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_conj_physical_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_floor_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gradient_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isfinite_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lstsq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_short_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sinc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unique_consecutive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_real_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_view_copy_cuda_float32 2025-07-17T09:42:08.1509178Z 2025-07-17T09:42:08.1509304Z Running test_ops 14/20 ... [2025-07-17 09:42:08.065000] 2025-07-17T09:42:08.1509572Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:42:08.1510294Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=14', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:42:08.065429] 2025-07-17T09:46:49.6126120Z 2025-07-17T09:46:49.6129393Z test_ops 13/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_13.20_9d89b2b7257f6194_.log 2025-07-17T09:46:49.6725944Z Running 1681 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_arange_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kron_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_det_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_errors___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_errors_bernoulli_cuda, test/test_ops.py::TestCommonCUDA::test_errors_complex_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gather_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_pow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_errors_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_errors_tril_cuda, test/test_ops.py::TestCommonCUDA::test_errors_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_hamming_cuda_float64, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cummax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_item_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_short_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_gaussian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tile_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_trace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fliplr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_put_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_or_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_inf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ormqr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tile_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_true_divide_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unflatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_acos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_argwhere_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_contiguous_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_corrcoef_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ldexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvalsh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_matrix_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_normalize_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rot90_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_full_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumulative_trapezoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_deg2rad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_2inputs_2outputs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_le_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_prelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_topk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsafe_chunk_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resize_as__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_geometric_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isneginf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_uint8, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_flipud_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_movedim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_square_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_stft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_transpose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_xor_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cauchy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_max_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_int_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lcm_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vander_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logdet_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_normalize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_positive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_qr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i0e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unravel_index_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_zeros_cuda_float32 2025-07-17T09:46:49.7098565Z 2025-07-17T09:46:49.7098686Z Running test_ops 17/20 ... [2025-07-17 09:46:49.636679] 2025-07-17T09:46:49.7098961Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:46:49.7099689Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=17', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:46:49.637248] 2025-07-17T09:50:45.2522992Z 2025-07-17T09:50:45.2524256Z test_ops 14/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_14.20_57aeb02f70226333_.log 2025-07-17T09:50:45.2989155Z Running 1648 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_slice_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_byte_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_int_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__upsample_bilinear2d_aa_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_einsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_half_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_with_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_negative_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_neg_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_errors___ror___cuda, test/test_ops.py::TestCommonCUDA::test_errors_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_errors_eye_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_errors_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_errors_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_errors_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_errors_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_median_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_outer_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cartesian_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_put_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ldexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mH_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_softsign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestCommonCUDA::test_out_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_abs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_4inputs_with_extra_args_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_qr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_put_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_zero__cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_where_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e5m2fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_unbiased_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unbind_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_chunk_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__chunk_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_contiguous_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_istft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_randn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_var_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagflat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mT_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_neg_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_positive_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_char_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_bfloat16_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_column_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cov_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_multi_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ormqr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pca_lowrank_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_triangular_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_digamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_zeta_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_permuted_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_inner_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cholesky_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cond_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_inv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_ctc_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_rms_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_roll_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_neg_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rsqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_blackman_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sparse_sampled_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_uniform_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zeros_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rand___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags___rmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_polar_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_masked_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__softmax_backward_data_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_asinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flip_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cond_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanmedian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_inf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_number_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_outer_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_select_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_complex_cuda_float32 2025-07-17T09:50:45.3365285Z 2025-07-17T09:50:45.3365410Z Running test_ops 18/20 ... [2025-07-17 09:50:45.269286] 2025-07-17T09:50:45.3365673Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:50:45.3366011Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T09:50:45.3366347Z Uploading artifacts took 0.00 seconds 2025-07-17T09:50:45.3367062Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=18', '--num-shards=20', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:50:45.269673] 2025-07-17T09:55:59.5116876Z 2025-07-17T09:55:59.5118184Z test_ops 17/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_17.20_4bd733b0cf6e562b_.log 2025-07-17T09:55:59.5731748Z Running 1764 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___rand___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_short_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_equal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argsort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_geqrf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_unary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_linear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_pinverse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_qr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_where_cuda, test/test_ops.py::TestCommonCUDA::test_errors___radd___cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_mean_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_put_accumulate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_aminmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_hermite_polynomial_h_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmedian_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ormqr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_searchsorted_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_cosine_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_byte_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_all_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cartesian_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_min_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nanquantile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nansum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_4_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch__scaled_mm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch_ops_aten__efficient_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triangular_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unique_consecutive_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_istft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_unbind_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e5m2, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_static_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_sparse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argsort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_mish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randn_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_transpose_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isreal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_permute_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_short_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_vdot_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_T_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_float_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_abs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_clone_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_lerp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_and_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_permute_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_split_with_sizes_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_var_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_count_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_equal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_exp2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_eye_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_full_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logcumsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lu_unpack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mH_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ravel_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rsub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_triangular_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_long_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_alias_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_imag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_reshape_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addcdiv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cfloat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_conj_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_count_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumulative_trapezoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_int_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_householder_product_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_hermitian_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_tensorsolve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_inf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_nuc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_slice_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_abs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_acos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_copysign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expm1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_igamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_igammac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isfinite_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lgamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal__in_place_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_positive_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_ndtri_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_where_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_xlogy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argsort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_partial_views_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cartesian_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_max_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_half_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isfinite_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_factor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_rank_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_hermitian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_slogdet_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ones_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ormqr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_slice_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_zeta_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unfold_copy_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_aminmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bucketize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_int_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_le_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isposinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_transpose_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_aminmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bucketize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bool, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_double_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ceil_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erfinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_hsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_var_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argwhere_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atleast_1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_multi_dot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_tensorinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_median_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_threshold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_slice_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_j0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i1_cuda_float32 2025-07-17T09:55:59.6123955Z 2025-07-17T09:55:59.6124096Z Running test_ops_gradients 1/3 ... [2025-07-17 09:55:59.533833] 2025-07-17T09:55:59.6124394Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:55:59.6125158Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'not serial', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:55:59.534440] 2025-07-17T09:57:48.7032790Z 2025-07-17T09:57:48.7041502Z test_ops 18/20 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_18.20_9809d4aedae439c8_.log 2025-07-17T09:57:48.7647586Z Running 1629 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_acos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_trilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_topk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsafe_split_cuda, test/test_ops.py::TestCommonCUDA::test_errors_arange_cuda, test/test_ops.py::TestCommonCUDA::test_errors_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_errors_item_cuda, test/test_ops.py::TestCommonCUDA::test_errors_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_trace_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagflat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_2inputs_2outputs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_consecutive_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_warning___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_equal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_half_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_circular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_reflect_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pinverse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resize__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vstack_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view___radd___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcdiv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_decomposed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_full_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_int_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_fro_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_permute_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_slice_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unflatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expm1_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_istft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_qr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logdet_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize_as__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sigmoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unbind_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zero__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rpow___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_transpose_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_abs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argwhere_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_baddbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_multi_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_normalize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestTagsCUDA::test_tags___radd___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rpow___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_equal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_pow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sum_to_size_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_heaviside_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_unary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vector_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rad2deg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hann_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unfold_cuda_float32 2025-07-17T09:57:48.8011967Z 2025-07-17T09:57:48.8012112Z Running test_ops_gradients 2/3 ... [2025-07-17 09:57:48.721393] 2025-07-17T09:57:48.8012413Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T09:57:48.8013461Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'not serial', '--shard-id=2', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 09:57:48.722130] 2025-07-17T10:05:18.5544659Z 2025-07-17T10:05:18.5545994Z test_ops_gradients 2/3 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_2.3_dae71aa3ff6349bc_.log 2025-07-17T10:05:18.6339518Z Running 1794 items in this shard: test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_slice_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_auto_functionalize_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_subgraph_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_triple_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_while_loop_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_auto_functionalize_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_subgraph_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scan_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_like_cuda_complex128 2025-07-17T10:05:18.6810790Z 2025-07-17T10:08:16.5590412Z 2025-07-17T10:08:16.5592045Z test_ops_gradients 1/3 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_1.3_a13ae3f377f03419_.log 2025-07-17T10:08:16.6171602Z Running 1761 items in this shard: test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpySplitCopyWithIntCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyTakeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpySplitCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cond_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardtanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpySplitCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_quant_packed_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_triple_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_while_loop_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyTakeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_roll_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_j1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lgamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_cuda_float64 2025-07-17T10:08:16.6621065Z 2025-07-17T10:08:20.6851181Z Running test batch 'tests to run' cost 4965.8 seconds 2025-07-17T10:08:22.0958304Z + python test/run_test.py --include inductor/test_torchinductor inductor/test_torchinductor_opinfo inductor/test_aot_inductor --shard 2 2 --verbose 2025-07-17T10:08:24.9505848Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-07-17T10:08:24.9507224Z import pkg_resources 2025-07-17T10:08:25.3505972Z No TD results found 2025-07-17T10:08:25.3507469Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-07-17T10:08:25.3779321Z Found test times from artifacts 2025-07-17T10:08:25.4797913Z Found test times from artifacts 2025-07-17T10:08:25.4831134Z Running all tests 2025-07-17T10:08:25.4837287Z Running parallel tests on 2 processes 2025-07-17T10:08:25.4838022Z Name: tests to run (est. time: 64.27min) 2025-07-17T10:08:25.4838570Z Serial tests (0): 2025-07-17T10:08:25.4839007Z Parallel tests (13): 2025-07-17T10:08:25.4839716Z inductor/test_aot_inductor 2/5 2025-07-17T10:08:25.4840255Z inductor/test_aot_inductor 5/5 2025-07-17T10:08:25.4840780Z inductor/test_torchinductor 1/2 2025-07-17T10:08:25.4841323Z inductor/test_torchinductor 2/2 2025-07-17T10:08:25.4842606Z inductor/test_torchinductor_opinfo 1/17 2025-07-17T10:08:25.4843204Z inductor/test_torchinductor_opinfo 4/17 2025-07-17T10:08:25.4844003Z inductor/test_torchinductor_opinfo 5/17 2025-07-17T10:08:25.4844635Z inductor/test_torchinductor_opinfo 8/17 2025-07-17T10:08:25.4845193Z inductor/test_torchinductor_opinfo 9/17 2025-07-17T10:08:25.4845765Z inductor/test_torchinductor_opinfo 12/17 2025-07-17T10:08:25.4846390Z inductor/test_torchinductor_opinfo 13/17 2025-07-17T10:08:25.4846967Z inductor/test_torchinductor_opinfo 16/17 2025-07-17T10:08:25.4847543Z inductor/test_torchinductor_opinfo 17/17 2025-07-17T10:08:25.4848143Z Name: excluded (est. time: 0.0min) 2025-07-17T10:08:25.4848717Z Serial tests (0): 2025-07-17T10:08:25.4849206Z Parallel tests (0): 2025-07-17T10:08:25.4849953Z Running inductor/test_aot_inductor 2/5 ... [2025-07-17 10:08:25.483668] 2025-07-17T10:08:25.4850842Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:08:25.4853066Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=2', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:08:25.484001] 2025-07-17T10:08:32.3641964Z 2025-07-17T10:08:32.3643445Z inductor/test_aot_inductor 2/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_2.5_1de544c419ec835e_.log 2025-07-17T10:08:32.3644674Z Running 0 items in this shard: 2025-07-17T10:08:32.3645532Z 2025-07-17T10:08:32.3645911Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T10:08:32.3646603Z Uploading artifacts took 0.00 seconds 2025-07-17T10:08:32.3647230Z Running inductor/test_aot_inductor 5/5 ... [2025-07-17 10:08:32.364122] 2025-07-17T10:08:32.3647864Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:08:32.3649442Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=5', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:08:32.364430] 2025-07-17T10:08:38.9443075Z 2025-07-17T10:08:38.9452967Z inductor/test_aot_inductor 5/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_5.5_30f0a9b4f12a1eeb_.log 2025-07-17T10:08:38.9454415Z Running 0 items in this shard: 2025-07-17T10:08:38.9454769Z 2025-07-17T10:08:38.9455179Z Running inductor/test_torchinductor 1/2 ... [2025-07-17 10:08:38.943995] 2025-07-17T10:08:38.9455942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:08:38.9457886Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:08:38.944602] 2025-07-17T10:09:10.3300725Z 2025-07-17T10:09:10.3302331Z inductor/test_torchinductor 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_1.2_cabc30e0fcb3e43d_.log 2025-07-17T10:09:10.3309731Z Running 1 items in this shard: test/inductor/test_torchinductor.py::GPUTests::test_large_block_sizes_cuda 2025-07-17T10:09:10.3310659Z 2025-07-17T10:09:10.3311098Z Running inductor/test_torchinductor 2/2 ... [2025-07-17 10:09:10.329814] 2025-07-17T10:09:10.3311871Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:09:10.3313759Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor.py', '-m', 'serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:09:10.330409] 2025-07-17T10:09:16.9603435Z 2025-07-17T10:09:16.9605116Z inductor/test_torchinductor 2/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_2.2_1e26aa8c27458292_.log 2025-07-17T10:09:16.9608049Z Running 0 items in this shard: 2025-07-17T10:09:16.9608477Z 2025-07-17T10:09:16.9608999Z Running inductor/test_torchinductor_opinfo 1/17 ... [2025-07-17 10:09:16.960126] 2025-07-17T10:09:16.9609972Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:09:16.9612626Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=1', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:09:16.960712] 2025-07-17T10:09:25.6451759Z 2025-07-17T10:09:25.6453631Z inductor/test_torchinductor_opinfo 1/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_1.17_20e0098fb02e67f6_.log 2025-07-17T10:09:25.6454918Z Running 0 items in this shard: 2025-07-17T10:09:25.6455234Z 2025-07-17T10:09:25.6455631Z Running inductor/test_torchinductor_opinfo 4/17 ... [2025-07-17 10:09:25.644914] 2025-07-17T10:09:25.6456334Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:09:25.6459212Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=4', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:09:25.645483] 2025-07-17T10:09:34.3803067Z 2025-07-17T10:09:34.3805303Z inductor/test_torchinductor_opinfo 4/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_4.17_aad3527a19d7dba1_.log 2025-07-17T10:09:34.3806840Z Running 0 items in this shard: 2025-07-17T10:09:34.3807185Z 2025-07-17T10:09:34.3819096Z Running inductor/test_torchinductor_opinfo 5/17 ... [2025-07-17 10:09:34.379982] 2025-07-17T10:09:34.3820115Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:09:34.3822120Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=5', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:09:34.380623] 2025-07-17T10:09:43.0148597Z 2025-07-17T10:09:43.0150585Z inductor/test_torchinductor_opinfo 5/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_5.17_8a9dab24edb791d4_.log 2025-07-17T10:09:43.0152222Z Running 0 items in this shard: 2025-07-17T10:09:43.0152569Z 2025-07-17T10:09:43.0153022Z Running inductor/test_torchinductor_opinfo 8/17 ... [2025-07-17 10:09:43.014549] 2025-07-17T10:09:43.0153854Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:09:43.0156816Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=8', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:09:43.015136] 2025-07-17T10:09:51.6497425Z 2025-07-17T10:09:51.6499603Z inductor/test_torchinductor_opinfo 8/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.17_f02ab43bd548b2b3_.log 2025-07-17T10:09:51.6501243Z Running 0 items in this shard: 2025-07-17T10:09:51.6501596Z 2025-07-17T10:09:51.6502053Z Running inductor/test_torchinductor_opinfo 9/17 ... [2025-07-17 10:09:51.649473] 2025-07-17T10:09:51.6502881Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:09:51.6504807Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=9', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:09:51.649816] 2025-07-17T10:10:00.3339552Z 2025-07-17T10:10:00.3348724Z inductor/test_torchinductor_opinfo 9/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_9.17_b15d8546567374e5_.log 2025-07-17T10:10:00.3350324Z Running 0 items in this shard: 2025-07-17T10:10:00.3350674Z 2025-07-17T10:10:00.3351147Z Running inductor/test_torchinductor_opinfo 12/17 ... [2025-07-17 10:10:00.333860] 2025-07-17T10:10:00.3351987Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:10:00.3353901Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=12', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:10:00.334431] 2025-07-17T10:10:09.0189094Z 2025-07-17T10:10:09.0190705Z inductor/test_torchinductor_opinfo 12/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_12.17_ea5cfcc2be70fe63_.log 2025-07-17T10:10:09.0192488Z Running 0 items in this shard: 2025-07-17T10:10:09.0192906Z 2025-07-17T10:10:09.0193425Z Running inductor/test_torchinductor_opinfo 13/17 ... [2025-07-17 10:10:09.018277] 2025-07-17T10:10:09.0194353Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:10:09.0196330Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=13', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:10:09.018861] 2025-07-17T10:10:17.7034225Z 2025-07-17T10:10:17.7036129Z inductor/test_torchinductor_opinfo 13/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_13.17_f20c97e9cb753a9b_.log 2025-07-17T10:10:17.7037477Z Running 0 items in this shard: 2025-07-17T10:10:17.7037794Z 2025-07-17T10:10:17.7038205Z Running inductor/test_torchinductor_opinfo 16/17 ... [2025-07-17 10:10:17.703260] 2025-07-17T10:10:17.7038918Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:10:17.7043441Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=16', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:10:17.703801] 2025-07-17T10:10:26.2376952Z 2025-07-17T10:10:26.2378116Z inductor/test_torchinductor_opinfo 16/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_16.17_d51067624c897c7d_.log 2025-07-17T10:10:26.2378991Z Running 0 items in this shard: 2025-07-17T10:10:26.2379175Z 2025-07-17T10:10:26.2379433Z Running inductor/test_torchinductor_opinfo 17/17 ... [2025-07-17 10:10:26.237509] 2025-07-17T10:10:26.2379887Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:10:26.2383585Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=17', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:10:26.237816] 2025-07-17T10:10:34.8724231Z 2025-07-17T10:10:34.8726110Z inductor/test_torchinductor_opinfo 17/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_17.17_45039cf511afaa3f_.log 2025-07-17T10:10:34.8727744Z Running 0 items in this shard: 2025-07-17T10:10:34.8728083Z 2025-07-17T10:10:37.5301299Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-07-17T10:10:37.5304506Z import pkg_resources 2025-07-17T10:10:37.5498435Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-07-17T10:10:37.5500959Z import pkg_resources 2025-07-17T10:10:37.7810907Z Running inductor/test_aot_inductor 2/5 ... [2025-07-17 10:10:37.780390] 2025-07-17T10:10:37.7815233Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:10:37.7816136Z Running inductor/test_aot_inductor 5/5 ... [2025-07-17 10:10:37.780417] 2025-07-17T10:10:37.7816912Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:10:37.7818854Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=2', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:10:37.780731] 2025-07-17T10:10:37.7822447Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=5', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:10:37.780798] 2025-07-17T10:18:34.3560684Z 2025-07-17T10:18:34.3562173Z inductor/test_aot_inductor 2/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_2.5_dcb3c84772225ddb_.log 2025-07-17T10:18:34.3664266Z Running 164 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_profiler_enable_kernel_profile_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotuning_args_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_backward_no_op_logging_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_boolean_indexing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_3_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_4_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_composed_dynamic_size_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_share_predicte_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_multiple_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_convolution_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_duplicated_params_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_scalar_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_foreach_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_inf_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_dynamic_dim_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_mmaped_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misaligned_input_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_no_args_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_on_gpu_device1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_abs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_permute_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_squeeze_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_interleave_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_replicate_on_devices_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_large_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_shape_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_from_multi_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_and_mul_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_so_without_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_fn_like_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_weight_on_disk_legacy_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_sym_expr_cond_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_size_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aliased_buffer_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_amp_fallback_random_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_cpp_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_fp8_dtype_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_assert_tensor_meta_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bool_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_symint_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_unbacked_symint_closure_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_type_propagation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_conv_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_with_refinement_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_scalar_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_embedding_bag_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fake_tensor_device_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fill__fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_index_put_with_none_index_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_inf_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_input_codegen_with_sympy_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_mmaped_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_tensor_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_on_gpu_device1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quanatized_int8_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_replicate_on_devices_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_device_type_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scatter_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_shifted_constraint_ranges_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_True_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_so_without_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_shape_with_div_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_sympy_expr_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_sympy_fn_like_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_with_none_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_view_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_mixed_device_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_cudagraphs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aliased_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_name_collision_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_user_defined_triton_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_runtime_asserts_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotune_with_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_nested_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_non_tensor_predicates_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_unbacked_symint_closure_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_outer_code_before_after_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_convolution_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dup_unbacked_sym_decl_with_refinement_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_cat_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fake_tensor_device_validation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_foreach_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_free_inactive_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_dynamic_dim_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_masked_select_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_cubin_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multiple_output_alias_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_default_gpu_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_tensor_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_device_type_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_shape_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_same_backing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scatter_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_so_without_weight_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symbool_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_fn_like_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_weird_param_order_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_next_power_of_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_constant_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_using_model_name_for_files_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_conv_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_outer_buffers_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_unbacked_symint_closure_dynamic_True_mps 2025-07-17T10:18:34.3771555Z 2025-07-17T10:18:34.3771772Z Running inductor/test_torchinductor 1/2 ... [2025-07-17 10:18:34.356949] 2025-07-17T10:18:34.3772182Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:18:34.3773259Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:18:34.357624] 2025-07-17T10:19:10.8822461Z 2025-07-17T10:19:10.8823797Z inductor/test_aot_inductor 5/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_5.5_20c35af3d8216ffd_.log 2025-07-17T10:19:10.8884376Z Running 168 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__int_mm_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_addmm_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_constant_tensor_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_cpp_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_fp8_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_user_defined_triton_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_profiler_enable_kernel_profile_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_tensor_meta_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_mismatched_branch_output_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_symint_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_deconv_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fake_tensor_device_validation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_with_none_index_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_dynamic_maxautotune_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_cubin_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multiple_output_alias_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_narrow_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nested_tensor_from_jagged_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_none_args_aot_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_pad_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_poi_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_profile_benchmark_harness_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_seq_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_shifted_constraint_ranges_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symfloat_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_torchvision_transforms_functional_tensor_resize_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_dynamic_launcher_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_extern_kernel_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_with_none_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_inactive_constant_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_conv_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_mixed_device_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_sym_expr_cond_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_unbacked_symint_closure_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_unbacked_symint_closure_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_cudagraphs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_offset_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_backed_symbols_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_unbacked_symbols_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_constant_tensor_name_collision_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_sym_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_user_defined_triton_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_assert_async_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bmm_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_non_tensor_predicates_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_reinterpret_view_inputs_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_cat_dtype_promotion_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_kernel_with_symexpr_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fft_c2c_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_view_of_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fx_gm_return_tuple_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_issue_140766_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misaligned_input_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_nan_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_none_args_aot_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pad_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_poi_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pytree_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_complex_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_False_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_split_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_and_mul_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_torchvision_transforms_functional_tensor_resize_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_infer_from_tensor_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_multi_output_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_weird_param_order_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_user_managed_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_using_model_name_for_files_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_weight_on_disk_legacy_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_nested_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_conv_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_parameters_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_sym_expr_cond_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_size_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_size_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_amp_fallback_random_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_assert_async_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_backward_no_op_logging_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_and_force_mmap_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_symint_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fallback_kernel_with_symexpr_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fft_c2c_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fqn_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_index_put_with_none_index_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_issue_140766_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_mmaped_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_weight_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_no_args_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_on_gpu_device1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_path_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_hann_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_permute_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeat_interleave_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeat_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_replicate_on_devices_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_dtype_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sdpa_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_shifted_constraint_ranges_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_False_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_from_multi_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_small_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_subclasses_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sympy_cpp_printer_min_max_minmax0_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_torchvision_transforms_functional_tensor_resize_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mem_leak_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_with_none_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_user_managed_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_unbacked_symint_closure_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_cudagraphs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_grid_with_unbacked_symbols_mps 2025-07-17T10:19:10.8937010Z 2025-07-17T10:19:10.8937183Z Running inductor/test_torchinductor 2/2 ... [2025-07-17 10:19:10.883097] 2025-07-17T10:19:10.8937525Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:19:10.8938338Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor.py', '-m', 'not serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:19:10.884320] 2025-07-17T10:25:28.9788189Z 2025-07-17T10:25:28.9789876Z inductor/test_torchinductor 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_1.2_ff48d272c09bebeb_.log 2025-07-17T10:25:28.9980266Z Running 443 items in this shard: test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_int, test/inductor/test_torchinductor.py::GPUTests::test_AllenaiLongformerBase_repro_cuda, test/inductor/test_torchinductor.py::GPUTests::test__dyn_quant_pack_4bit_weight_cuda, test/inductor/test_torchinductor.py::GPUTests::test__unsafe_masked_index_put_accumulate_cuda, test/inductor/test_torchinductor.py::GPUTests::test_abs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool1d_argmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool_with_output_size_0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_max_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_max_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_pool_errors_with_long_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_const_float_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_inplace_permuted_cuda, test/inductor/test_torchinductor.py::GPUTests::test_addmm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_alexnet_prefix_cuda, test/inductor/test_torchinductor.py::GPUTests::test_angle_cuda, test/inductor/test_torchinductor.py::GPUTests::test_any_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_with_persistent_cache_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_with_scalar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_as_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_alignment_op_name_fail_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_size_stride_op_name_fail_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_size_stride_op_name_pass_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_batch_norm_2d_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_batch_norm_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bitwise2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bmm2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_both_scalars_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_add_autotune_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_computed_offsets_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_default_kwargs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int64_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_float_ndigits_neg_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_int_ndigits_pos_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_empty_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_single_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_unbacked_empty_1d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cauchy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_chunk_recompiles_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clamp_type_promotion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clamp_type_promotion_non_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clone_cuda, test/inductor/test_torchinductor.py::GPUTests::test_compar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_complex_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_complex_memory_overlap_cuda, test/inductor/test_torchinductor.py::GPUTests::test_computed_buffer_inlining_cuda, test/inductor/test_torchinductor.py::GPUTests::test_concat_add_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_config_option_dont_assume_alignment_cudagraphs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_config_option_dont_assume_alignment_recompiles_cuda, test/inductor/test_torchinductor.py::GPUTests::test_consecutive_split_cumprod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_const_int32_to_float_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_3d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_fill_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_nd_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv2d_backward_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv2d_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv3d_channels_last_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv3d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_shape_check_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cudnn_rnn_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cummin_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumprod_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_inf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_pattern_matcher_issue_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_unbacked_symints_cuda, test/inductor/test_torchinductor.py::GPUTests::test_data_type_propogation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dense_mask_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_deterministic_codegen_on_graph_break_cuda, test/inductor/test_torchinductor.py::GPUTests::test_deterministic_codegen_with_suffix_cuda, test/inductor/test_torchinductor.py::GPUTests::test_device_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_by_zero_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_precision_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dont_constant_fold_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_trivial_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtype_mismatch_issue_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_fusion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_elu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_bag_byte_unpack_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_bag_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_sparse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_empty_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_exp2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_exp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expand_as_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expand_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expanded_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_basic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_with_return_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fft_real_input_real_output_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fill1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float16_to_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float32_to_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float_index_expression_type_promotion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_floordiv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fmod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fmod_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_boolean_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_like_cuda, test/inductor/test_torchinductor.py::GPUTests::test_full_truncation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_functionalize_rng_wrappers_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fuse_large_params_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fusing_write_into_disjoint_read_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gelu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_generate_rand_fp8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_generated_code_has_alignment_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_generated_code_has_size_stride_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_glu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_arange2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_argmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_both_scalars_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_constant_tensor1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_mutation_real_name_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_pad_dynamic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_refcount_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_scalar_inputs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_grid_sampler_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_hardtanh_cuda, test/inductor/test_torchinductor.py::GPUTests::test_horizonal_fusion1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_flip_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_floordiv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_remainder_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_as_masked_fill_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_failed_reinplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_fallback1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_fallback2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_indirect_load_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_assert_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_layout_optimization_input_mutations_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_mixed_dtype_ops_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_resize_as_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_where_pointwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_int8_weight_only_quant_cuda, test/inductor/test_torchinductor.py::GPUTests::test_int_input_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_invalid_operand_issue1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_isin_tensor_scalar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_isinf2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_grid_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_offset_pointwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_layer_norm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_lerp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear_dynamic_maxautotune_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_list_clearing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log1p_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logsumexp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_long_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_1_dim_3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_masked_fill_cuda, test/inductor/test_torchinductor.py::GPUTests::test_masked_fill_promotion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d6_dilation_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d6_dilation_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mean_cuda, test/inductor/test_torchinductor.py::GPUTests::test_min_max_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_misaligned_address_issue1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_move_arange_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_threading_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_any_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_prime_size_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_sum_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mutations_loop_fusion_cuda, test/inductor/test_torchinductor.py::GPUTests::test_narrow_cuda, test/inductor/test_torchinductor.py::GPUTests::test_needs_contiguous_strides_cuda, test/inductor/test_torchinductor.py::GPUTests::test_new_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_new_empty_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_new_ones_cuda, test/inductor/test_torchinductor.py::GPUTests::test_one_hot_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pad_cast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pad_single_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pad_view_cuda, test/inductor/test_torchinductor.py::GPUTests::test_philox_rand_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pixel_shuffle_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_airy_ai_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_y0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_y1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_t_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erfcx_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erfinv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_exp2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_gammaln_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_hermite_polynomial_h_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_laguerre_polynomial_l_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_legendre_polynomial_p_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_log1p_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_log_ndtr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_i0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_i1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_k1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_multigammaln_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_ndtri_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_psi_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_round_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_scaled_modified_bessel_k0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_v_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_w_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_spherical_bessel_j0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_zeta_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow_by_natural_log2_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_prod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_rand_like_deterministic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_kernel_count_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randn_with_dtype_and_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reflection_pad2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remainder_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_clone_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_copy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_slice1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_as_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_cuda, test/inductor/test_torchinductor.py::GPUTests::test_roi_align_cuda, test/inductor/test_torchinductor.py::GPUTests::test_round_correctness_cuda, test/inductor/test_torchinductor.py::GPUTests::test_round_cuda, test/inductor/test_torchinductor.py::GPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scalar_cpu_tensor_arg_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scalar_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_add2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_bf16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_reduce2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_reduce3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scheduler_vertical_fusion1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_unaligned_mask_cuda, test/inductor/test_torchinductor.py::GPUTests::test_searchsorted_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_setitem_with_int_parameter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sgn_cuda, test/inductor/test_torchinductor.py::GPUTests::test_shape_prop_torch_ones_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sign_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_simplify_loops_cuda, test/inductor/test_torchinductor.py::GPUTests::test_single_elem_indirect_cuda, test/inductor/test_torchinductor.py::GPUTests::test_size_asserts_for_multi_output_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sizehint_issue1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_view_with_graph_break_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_one_kernel_persist_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_stable_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumprod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_failed_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_reduction_with_int64_size_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_list_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_sizes_with_unbacked_symints_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_unbacked_symints_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_squeeze1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_stack_cuda, test/inductor/test_torchinductor.py::GPUTests::test_strided_inputs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum_int_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum_keepdims_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tan_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tanh_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor_index_put_slice_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tmp_not_defined_issue3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_topk_cuda, test/inductor/test_torchinductor.py::GPUTests::test_transpose_add_cuda, test/inductor/test_torchinductor.py::GPUTests::test_transpose_cuda, test/inductor/test_torchinductor.py::GPUTests::test_uint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unroll_small_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unsqueeze_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unsqueeze_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_bicubic2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_cat_conv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest2d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_mean_div_by_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vdd_clamp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vertical_fusion1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_as_real_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_where_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_where_with_logical_op_cuda, test/inductor/test_torchinductor.py::GPUTests::test_zeros_cuda, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_bandwidth_profiler, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_codegen_config_option_dont_assume_alignment, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_comment_graph_fragment, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_computed_indirect_mask, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_constant_folding_deallocation, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_divisible_by_16_covers_numel_args, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_donated_buffer_inplace_gpt, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_evict_last_non_coalesced_loads, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_buffer_reuse, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_foreach_op, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_multiple_functions, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_symint_cat_backward, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_symint_from_mutation_index, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_symint_from_nested_indirect_indexing, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_unbacked_symint_multi_output_layout, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_indirect_device_assert, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_inductor_detach_view_backend_aot_eager, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_kernel_names_descriptive, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_layer_norm_inplaces_after_matmul, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_non_blocking_copy_codegen, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_numpy_autograd, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_indexing_dtype, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_red_followed_by_transposed_pointwise, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_sdpa_inference_mode_aot_compile, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_skip_l1_cache, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_split_op_with_sym, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_triton_attrs_dict_constexpr_signature, test/inductor/test_torchinductor.py::NanCheckerTest::test_nan_checker_fail 2025-07-17T10:25:29.0160249Z 2025-07-17T10:25:29.0160594Z Running inductor/test_torchinductor_opinfo 1/17 ... [2025-07-17 10:25:28.979832] 2025-07-17T10:25:29.0161230Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:25:29.0162698Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=1', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:25:28.980340] 2025-07-17T10:26:53.7738609Z 2025-07-17T10:26:53.7739656Z inductor/test_torchinductor 2/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_2.2_fbcc4704a5b6d2f0_.log 2025-07-17T10:26:53.7877823Z Running 500 items in this shard: test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast1_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast2_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_broadcast3_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_dense_transposed, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_double_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_int_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_broadcast2, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_double, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_strided_int, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_broadcast1, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_broadcast3, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_dense, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_strided, test/inductor/test_torchinductor.py::SweepInputsGPUTest::test_cuda_transposed_transposed, test/inductor/test_torchinductor.py::GPUTests::test__dyn_quant_matmul_4bit_cuda, test/inductor/test_torchinductor.py::GPUTests::test__unsafe_masked_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool2d_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_avg_pool_errors_with_long_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adaptive_max_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_complex_cuda, test/inductor/test_torchinductor.py::GPUTests::test_add_const_int_cuda, test/inductor/test_torchinductor.py::GPUTests::test_adding_tensor_offsets_cuda, test/inductor/test_torchinductor.py::GPUTests::test_addmv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aliased_buffer_reuse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_cache_hit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_dtype_device_layout_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_override_registration_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_support_out_cuda, test/inductor/test_torchinductor.py::GPUTests::test_aoti_eager_support_str_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_arange5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin_with_duplicates_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_argmin_with_nan_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_min_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_argmax_to_float_cuda, test/inductor/test_torchinductor.py::GPUTests::test_as_strided_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_assert_alignment_op_name_pass_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool2d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool3d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_avg_pool_errors_with_uint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_baddbmm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bernoulli1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bernoulli2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bfloat16_to_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bitwise3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bitwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bmm1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bool_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_broadcast_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int32_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_int8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_int_uint8_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_nd_tiling_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_bucketize_nd_tiling_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_batch_norm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_copied_in_graph_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_copied_in_graph_with_different_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_buffer_use_after_remove_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_float_ndigits_pos_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_float_ndigits_zero_cuda, test/inductor/test_torchinductor.py::GPUTests::test_builtins_round_int_ndigits_zero_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_extern_kernel_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_inplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_negative_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_of_loops_and_extern_kernel_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_unbacked_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_unbacked_legacy_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cat_upcasting_cuda, test/inductor/test_torchinductor.py::GPUTests::test_check_stack_no_cycles_cuda, test/inductor/test_torchinductor.py::GPUTests::test_clamp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_config_option_dont_assume_alignment_cuda, test/inductor/test_torchinductor.py::GPUTests::test_consecutive_split_cumsum_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_1d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_constant_pad_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv3d_channels_last_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_bn_fuse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_functional_bn_fuse_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_inference_heuristics_cuda, test/inductor/test_torchinductor.py::GPUTests::test_conv_with_as_strided_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_convolution5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cos_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_no_mask_cuda, test/inductor/test_torchinductor.py::GPUTests::test_cumsum_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_default_layout_constraint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_fixed_layout_channels_last_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_op_fixed_layout_sequential_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_op_compiled_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_op_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_op_multi_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_custom_scan_would_split_cuda, test/inductor/test_torchinductor.py::GPUTests::test_deterministic_codegen_cuda, test/inductor/test_torchinductor.py::GPUTests::test_diagonal_copy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dist_bf16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dist_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div9_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_prim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_softmax_symfloat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_div_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_deterministic_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dropout_trivial_0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtype_sympy_expr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_bfloat16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float16_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float32_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_float64_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int16_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int32_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_int8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int64_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_int8_int64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_float16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_dtypeview_uint8_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_embedding_cuda, test/inductor/test_torchinductor.py::GPUTests::test_empty1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_empty2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_erfc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_erfinv_cuda, test/inductor/test_torchinductor.py::GPUTests::test_exact_stride_cuda, test/inductor/test_torchinductor.py::GPUTests::test_expm1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_no_mutated_tensors_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fft_real_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fill2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_flip_cat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_flip_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float_index_expression_cuda, test/inductor/test_torchinductor.py::GPUTests::test_float_repr_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fmin_fmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_forced_buffer_realize_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fractional_max_pool2d5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_fuse_tiled_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_gather_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_getitem_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_arange1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_constant_tensor2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_misaligned_input_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_no_inputs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_graph_partition_unbacked_symint_as_output_cuda, test/inductor/test_torchinductor.py::GPUTests::test_hardsigmoid_cuda, test/inductor/test_torchinductor.py::GPUTests::test_hardswish_cuda, test/inductor/test_torchinductor.py::GPUTests::test_horizonal_fusion2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_dynamic_shapes_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_abs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_device_assert_masked_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_propagation_nested_indirect_indexing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_deterministic_fallback_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_put_reinplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_remainder_cuda, test/inductor/test_torchinductor.py::GPUTests::test_index_select_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inductor_multiple_specializations_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inner_fn_str_and_stride_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_activations_cuda, test/inductor/test_torchinductor.py::GPUTests::test_inplace_add_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_input_mutation5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_insignificant_strides_cuda, test/inductor/test_torchinductor.py::GPUTests::test_isinf_cuda, test/inductor/test_torchinductor.py::GPUTests::test_issue102546_cuda, test/inductor/test_torchinductor.py::GPUTests::test_kernel_names_cuda, test/inductor/test_torchinductor.py::GPUTests::test_kwargs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_l1_loss_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_broadcast_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_grid_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_pointwise_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_strided_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_large_tensor_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_leaky_relu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_lgamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_like_rands3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linear_mixed_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_linspace4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log_fp64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_log_softmax_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logaddexp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logcumsumexp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_logcumsumexp_zero_dim_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_low_memory_max_pool_dilation_2_dim_3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_masked_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_matmul_layer_norm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_min_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_max_pool2d_with_indices_backward6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_min_max_reduction_nan_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mix_device_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mixed_mm2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mixed_mm3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mixed_mm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mm_mixed_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mm_views_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mul_index_expr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mul_softmax_symfloat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_gpu_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multi_gpu_recompile_on_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_var_cuda, test/inductor/test_torchinductor.py::GPUTests::test_multilayer_var_lowp_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mutable_custom_op_fixed_layout2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_mutable_custom_op_fixed_layout_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nan_to_num_cuda, test/inductor/test_torchinductor.py::GPUTests::test_neg_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_neg_max_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nll_loss_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nll_loss_forward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_no_mega_fusion_during_lowering_cuda, test/inductor/test_torchinductor.py::GPUTests::test_no_op_reduction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_no_specization_over_symbolic_value_cuda, test/inductor/test_torchinductor.py::GPUTests::test_nonzero_unbacked_refinement_cuda, test/inductor/test_torchinductor.py::GPUTests::test_norm_constant_overflow_cuda, test/inductor/test_torchinductor.py::GPUTests::test_output_strides_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pattern_matcher_multi_user_cuda, test/inductor/test_torchinductor.py::GPUTests::test_permute1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_permute2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_j0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_bessel_j1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_u_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_v_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_chebyshev_polynomial_w_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_digamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_entr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_erfc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_expit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_expm1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_gammainc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_gammaincc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_hermite_polynomial_he_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i0e_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_i1e_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_logit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_modified_bessel_k0_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_ndtr_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_polygamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_scaled_modified_bessel_k1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_t_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_shifted_chebyshev_polynomial_u_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_sinc_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_xlog1py_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pointwise_xlogy_cuda, test/inductor/test_torchinductor.py::GPUTests::test_polar_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow_int_cuda, test/inductor/test_torchinductor.py::GPUTests::test_pow_symfloat_cuda, test/inductor/test_torchinductor.py::GPUTests::test_prepare_softmax_with_fast_math_cuda, test/inductor/test_torchinductor.py::GPUTests::test_profiler_mark_wrapper_call_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_distribution_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randint_int64_mod_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randn_generator_cuda, test/inductor/test_torchinductor.py::GPUTests::test_randn_like_empty_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reduction_config_limit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reflection_pad2d_backward_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reinterpret_dtypeview_cuda, test/inductor/test_torchinductor.py::GPUTests::test_relu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_no_ops_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_slice_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_slice_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_view_default_cuda, test/inductor/test_torchinductor.py::GPUTests::test_remove_noop_view_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_repeat_interleave_2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_replication_pad_errors_with_bool_cuda, test/inductor/test_torchinductor.py::GPUTests::test_require_stride_expanded_cuda, test/inductor/test_torchinductor.py::GPUTests::test_resize_as_cuda, test/inductor/test_torchinductor.py::GPUTests::test_resize_cuda, test/inductor/test_torchinductor.py::GPUTests::test_reuse_buffers_with_aliasing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_roll_cuda, test/inductor/test_torchinductor.py::GPUTests::test_rsqrt_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scalar_output_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scaled_dot_product_attention_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_add1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_add3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_scatter_reduce1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sdpa_unaligned_mask_freezing_cuda, test/inductor/test_torchinductor.py::GPUTests::test_searchsorted_cuda, test/inductor/test_torchinductor.py::GPUTests::test_select_scatter_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sgn_extremal_cuda, test/inductor/test_torchinductor.py::GPUTests::test_shape_padding_cuda, test/inductor/test_torchinductor.py::GPUTests::test_should_pad_bench_for_bmm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sigmoid_cuda, test/inductor/test_torchinductor.py::GPUTests::test_signbit_cuda, test/inductor/test_torchinductor.py::GPUTests::test_silu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sin_cuda, test/inductor/test_torchinductor.py::GPUTests::test_single_elem_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_mutation1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_mutation2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_mutation3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_slice_scatter_reinplace_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_backward_data_cuda, test/inductor/test_torchinductor.py::GPUTests::test_softmax_one_kernel_loop_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_bool_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sort_transpose_cuda, test/inductor/test_torchinductor.py::GPUTests::test_special_polygamma_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumprod_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumsum_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumsum_index_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_cumsum_low_prec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_reduction_dynamic_shape_cuda, test/inductor/test_torchinductor.py::GPUTests::test_split_with_integer_cuda, test/inductor/test_torchinductor.py::GPUTests::test_squeeze2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_squeeze_varargs_cuda, test/inductor/test_torchinductor.py::GPUTests::test_std_cuda, test/inductor/test_torchinductor.py::GPUTests::test_stride_preservation_with_stride_modifying_fx_pass_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum1_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum3_cuda, test/inductor/test_torchinductor.py::GPUTests::test_sum5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tensor_index_slice_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tmp_not_defined_issue1_use_block_ptr_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_tmp_not_defined_issue2_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_device_constant_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_device_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_dtype_cuda, test/inductor/test_torchinductor.py::GPUTests::test_to_memory_format_cuda, test/inductor/test_torchinductor.py::GPUTests::test_transposed_propagates_cuda, test/inductor/test_torchinductor.py::GPUTests::test_triu_cuda, test/inductor/test_torchinductor.py::GPUTests::test_uint4x2_mixed_mm_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unbacked_floordiv_simplify_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unbacked_floordiv_simplify_errors_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unbind_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unfold_zero_dimension_tensor_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_bfloat16_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_float32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_float64_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_int32_cuda, test/inductor/test_torchinductor.py::GPUTests::test_unspec_inputs_uint8_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_bilinear2d_a_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_bilinear2d_b_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest1d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest2d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_upsample_nearest3d_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_correction_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_mean_tile_reduction_False_cuda, test/inductor/test_torchinductor.py::GPUTests::test_var_mean_tile_reduction_True_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vectorized_ops_masked_cuda, test/inductor/test_torchinductor.py::GPUTests::test_vectorized_ops_masked_var_novec_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_as_complex_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_detach_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_on_aliased_cuda, test/inductor/test_torchinductor.py::GPUTests::test_view_uint8_through_differing_bitwidths_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views4_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views5_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views6_cuda, test/inductor/test_torchinductor.py::GPUTests::test_views7_cuda, test/inductor/test_torchinductor.py::GPUTests::test_weight_norm_bwd_cuda, test/inductor/test_torchinductor.py::GPUTests::test_xblock_divides_xnumel_cuda, test/inductor/test_torchinductor.py::GPUTests::test_zero_dim_reductions_cuda, test/inductor/test_torchinductor.py::GPUTests::test_zero_element_mutation_cuda, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_cant_optimize_compute, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_ctr_not_moved_to_cuda_when_used_in_index_put, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_donated_buffer_inplace, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_evict_last_non_coalesced_loads_block_ptr, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_condition_op, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_dynamic_scalar_inputs, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_fused_scheduler_node, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_item, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_symint, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_graph_partition_unbacked_symint, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_grouped_mm, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_has_constant_mask_block_multiple_False_ynumel_exceed_ygrid_size_False, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_has_constant_mask_block_multiple_True_ynumel_exceed_ygrid_size_False, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_has_constant_mask_block_multiple_True_ynumel_exceed_ygrid_size_True, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_inductor_detach_view_backend_inductor, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_inductor_sequence_nr, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_not_materialize_pointwise_reduction, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_numpy_on_gpu, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_compute, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_indexing_assert, test/inductor/test_torchinductor.py::TritonCodeGenTests::test_optimize_indexing_dtype_with_constraint, test/inductor/test_torchinductor.py::RNNTest::test_rnn_compile_safe, test/inductor/test_torchinductor.py::NanCheckerTest::test_nan_checker_pass 2025-07-17T10:26:53.8022912Z 2025-07-17T10:26:53.8023165Z Running inductor/test_torchinductor_opinfo 4/17 ... [2025-07-17 10:26:53.774561] 2025-07-17T10:26:53.8023599Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:26:53.8024606Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=4', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:26:53.774936] 2025-07-17T10:33:29.4366413Z 2025-07-17T10:33:29.4372421Z inductor/test_torchinductor_opinfo 4/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_4.17_559d9a56e781b351_.log 2025-07-17T10:33:29.4485351Z Running 185 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___ror___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__native_batch_norm_legit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_baddbmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_max_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geqrf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_elu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_prelu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_nuc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polar_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_uint32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float64 2025-07-17T10:33:29.4615347Z 2025-07-17T10:33:29.4615621Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T10:33:29.4616274Z Running inductor/test_torchinductor_opinfo 5/17 ... [2025-07-17 10:33:29.436560] 2025-07-17T10:33:29.4616871Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:33:29.4617266Z Uploading artifacts took 0.00 seconds 2025-07-17T10:33:29.4618486Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=5', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:33:29.436915] 2025-07-17T10:33:57.8362829Z 2025-07-17T10:33:57.8364843Z inductor/test_torchinductor_opinfo 1/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_1.17_3c6fbf64cf28ddd2_.log 2025-07-17T10:33:57.8453829Z Running 201 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_lengths_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_decomposed_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_not_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cauchy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hypot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cond_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eig_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_kl_div_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_selu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_uniform_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_int32 2025-07-17T10:33:57.8518120Z 2025-07-17T10:33:57.8518353Z Running inductor/test_torchinductor_opinfo 8/17 ... [2025-07-17 10:33:57.837187] 2025-07-17T10:33:57.8518812Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:33:57.8519782Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=8', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:33:57.837778] 2025-07-17T10:43:16.2962964Z 2025-07-17T10:43:16.2964034Z inductor/test_torchinductor_opinfo 5/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_5.17_a06448189fe331b0_.log 2025-07-17T10:43:16.3043904Z Running 241 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_offsets_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addbmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addbmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_baddbmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_inverse_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_max_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flatten_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_det_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_householder_product_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_rank_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_normalize_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_std_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_dropout_backward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_glu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_rms_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_nuc_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_number_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_renorm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_indices_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unravel_index_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_int64 2025-07-17T10:43:16.3122296Z 2025-07-17T10:43:16.3122535Z Running inductor/test_torchinductor_opinfo 9/17 ... [2025-07-17 10:43:16.295958] 2025-07-17T10:43:16.3123015Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:43:16.3123932Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=9', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:43:16.296297] 2025-07-17T10:43:16.8821389Z 2025-07-17T10:43:16.8822572Z inductor/test_torchinductor_opinfo 8/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.17_2f6af4e55cc6497e_.log 2025-07-17T10:43:16.8908144Z Running 190 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_angle_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_left_shift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_not_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cond_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eig_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_singular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_svd_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_unpack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logsumexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_select_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanquantile_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nextafter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_elu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_selu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_inf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_inf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_0_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float64 2025-07-17T10:43:16.8986073Z 2025-07-17T10:43:16.8986400Z Running inductor/test_torchinductor_opinfo 12/17 ... [2025-07-17 10:43:16.882683] 2025-07-17T10:43:16.8986971Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:43:16.8988115Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=12', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:43:16.882984] 2025-07-17T10:50:29.3485899Z 2025-07-17T10:50:29.3486960Z inductor/test_torchinductor_opinfo 12/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_12.17_d2d619c266be534d_.log 2025-07-17T10:50:29.3585678Z Running 194 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___ror___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__native_batch_norm_legit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_baddbmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lcm_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_singular_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logsumexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_zeros_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_silu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pow_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pow_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_sampled_addmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_lowrank_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensordot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tile_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unravel_index_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float32 2025-07-17T10:50:29.3701517Z 2025-07-17T10:50:29.3701990Z Running inductor/test_torchinductor_opinfo 13/17 ... [2025-07-17 10:50:29.348674] 2025-07-17T10:50:29.3702814Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:50:29.3704771Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=13', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:50:29.348954] 2025-07-17T10:51:19.4264484Z 2025-07-17T10:51:19.4266773Z inductor/test_torchinductor_opinfo 9/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_9.17_27bcb465fd5bf04f_.log 2025-07-17T10:51:19.4363260Z Running 217 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmatmul___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmod___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bernoulli_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_complex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igammac_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igammac_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_inner_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_det_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorinv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorinv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_median_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_dropout_backward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_kl_div_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_kl_div_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_inf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ormqr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_quantile_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_bartlett_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_cosine_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_gaussian_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hamming_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tile_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vdot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float16 2025-07-17T10:51:19.4482143Z 2025-07-17T10:51:19.4482353Z Running inductor/test_torchinductor_opinfo 16/17 ... [2025-07-17 10:51:19.426650] 2025-07-17T10:51:19.4482714Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:51:19.4483556Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=16', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:51:19.427004] 2025-07-17T10:59:47.5603960Z 2025-07-17T10:59:47.5609531Z inductor/test_torchinductor_opinfo 13/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_13.17_ec91a4949779e57d_.log 2025-07-17T10:59:47.5773591Z Running 209 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_decomposed_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lerp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_triangular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorsolve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_select_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_normalize_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pow_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_neg_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_exponential_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_kaiser_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tile_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triangular_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float32 2025-07-17T10:59:47.5932014Z 2025-07-17T10:59:47.5932430Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-07-17T10:59:47.5933205Z Uploading artifacts took 0.00 seconds 2025-07-17T10:59:47.5933938Z Running inductor/test_torchinductor_opinfo 17/17 ... [2025-07-17 10:59:47.560467] 2025-07-17T10:59:47.5934697Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-07-17T10:59:47.5936701Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=17', '--num-shards=17', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-07-17 10:59:47.561096] 2025-07-17T11:00:14.8931837Z 2025-07-17T11:00:14.8932837Z inductor/test_torchinductor_opinfo 16/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_16.17_5f054986af962439_.log 2025-07-17T11:00:14.9027261Z Running 217 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmod___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___ror___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_offsets_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_decomposed_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_left_shift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_inverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_max_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_max_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flatten_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_inner_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lcm_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lerp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_normal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_log_softmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pdist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ormqr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_exponential_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_uniform_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_uint64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_int64 2025-07-17T11:00:14.9114746Z 2025-07-17T11:08:16.1584038Z 2025-07-17T11:08:16.1586215Z inductor/test_torchinductor_opinfo 17/17 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_17.17_8732374584867fba_.log 2025-07-17T11:08:16.1728971Z Running 204 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmatmul___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmod___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_angle_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_angle_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bernoulli_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_uint32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvalsh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_power_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logsumexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_rrelu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_nuttall_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unravel_index_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_uint8 2025-07-17T11:08:16.1871997Z 2025-07-17T11:08:17.1062710Z Running test batch 'tests to run' cost 3591.62 seconds 2025-07-17T11:08:17.7237135Z + [[ 2 == 1 ]] 2025-07-17T11:08:17.7275972Z + sccache_epilogue 2025-07-17T11:08:17.7276782Z + echo '::group::Sccache Compilation Log' 2025-07-17T11:08:17.7277501Z ##[group]Sccache Compilation Log 2025-07-17T11:08:17.7277859Z + echo '=================== sccache compilation log ===================' 2025-07-17T11:08:17.7278226Z =================== sccache compilation log =================== 2025-07-17T11:08:17.7278908Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log 2025-07-17T11:08:17.7417180Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' 2025-07-17T11:08:17.7418182Z =========== If your build fails, please take a look at the log above for possible reasons =========== 2025-07-17T11:08:17.7418836Z + sccache --show-stats 2025-07-17T11:08:17.7467467Z Compile requests 6061 2025-07-17T11:08:17.7467827Z Compile requests executed 471 2025-07-17T11:08:17.7468127Z Cache hits 32 2025-07-17T11:08:17.7468397Z Cache hits (C/C++) 32 2025-07-17T11:08:17.7468699Z Cache misses 439 2025-07-17T11:08:17.7468964Z Cache misses (C/C++) 433 2025-07-17T11:08:17.7469238Z Cache misses (HIP) 6 2025-07-17T11:08:17.7469525Z Cache hits rate 6.79 % 2025-07-17T11:08:17.7469819Z Cache hits rate (C/C++) 6.88 % 2025-07-17T11:08:17.7470110Z Cache hits rate (HIP) 0.00 % 2025-07-17T11:08:17.7470393Z Cache timeouts 0 2025-07-17T11:08:17.7470667Z Cache read errors 0 2025-07-17T11:08:17.7471169Z Forced recaches 0 2025-07-17T11:08:17.7471452Z Cache write errors 0 2025-07-17T11:08:17.7471677Z Cache errors 0 2025-07-17T11:08:17.7471901Z Compilations 439 2025-07-17T11:08:17.7472179Z Compilation failures 0 2025-07-17T11:08:17.7472418Z Non-cacheable compilations 0 2025-07-17T11:08:17.7472904Z Non-cacheable calls 249 2025-07-17T11:08:17.7473126Z Non-compilation calls 5341 2025-07-17T11:08:17.7473452Z Unsupported compiler calls 0 2025-07-17T11:08:17.7473681Z Average cache write 0.000 s 2025-07-17T11:08:17.7473912Z Average compiler 2.443 s 2025-07-17T11:08:17.7474132Z Average cache read hit 0.000 s 2025-07-17T11:08:17.7474367Z Failed distributed compilations 0 2025-07-17T11:08:17.7474519Z 2025-07-17T11:08:17.7474601Z Non-cacheable reasons: 2025-07-17T11:08:17.7474805Z unknown source language 211 2025-07-17T11:08:17.7475031Z -E 38 2025-07-17T11:08:17.7475177Z 2025-07-17T11:08:17.7475323Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-07-17T11:08:17.7475631Z Use direct/preprocessor mode? yes 2025-07-17T11:08:17.7475864Z Version (client) 0.10.0 2025-07-17T11:08:17.7476088Z Cache size 22 MiB 2025-07-17T11:08:17.7476328Z Max cache size 10 GiB 2025-07-17T11:08:17.7476557Z + sccache --stop-server 2025-07-17T11:08:17.7490473Z Stopping sccache server... 2025-07-17T11:08:17.7493774Z Compile requests 6061 2025-07-17T11:08:17.7494145Z Compile requests executed 471 2025-07-17T11:08:17.7494446Z Cache hits 32 2025-07-17T11:08:17.7494718Z Cache hits (C/C++) 32 2025-07-17T11:08:17.7494999Z Cache misses 439 2025-07-17T11:08:17.7495273Z Cache misses (C/C++) 433 2025-07-17T11:08:17.7506888Z Cache misses (HIP) 6 2025-07-17T11:08:17.7507463Z Cache hits rate 6.79 % 2025-07-17T11:08:17.7508013Z Cache hits rate (C/C++) 6.88 % 2025-07-17T11:08:17.7508541Z Cache hits rate (HIP) 0.00 % 2025-07-17T11:08:17.7509068Z Cache timeouts 0 2025-07-17T11:08:17.7509391Z Cache read errors 0 2025-07-17T11:08:17.7509672Z Forced recaches 0 2025-07-17T11:08:17.7509940Z Cache write errors 0 2025-07-17T11:08:17.7510215Z Cache errors 0 2025-07-17T11:08:17.7510482Z Compilations 439 2025-07-17T11:08:17.7510754Z Compilation failures 0 2025-07-17T11:08:17.7511036Z Non-cacheable compilations 0 2025-07-17T11:08:17.7511449Z Non-cacheable calls 249 2025-07-17T11:08:17.7511734Z Non-compilation calls 5341 2025-07-17T11:08:17.7512008Z Unsupported compiler calls 0 2025-07-17T11:08:17.7512291Z Average cache write 0.000 s 2025-07-17T11:08:17.7512590Z Average compiler 2.443 s 2025-07-17T11:08:17.7512870Z Average cache read hit 0.000 s 2025-07-17T11:08:17.7513158Z Failed distributed compilations 0 2025-07-17T11:08:17.7513372Z 2025-07-17T11:08:17.7513473Z Non-cacheable reasons: 2025-07-17T11:08:17.7513722Z unknown source language 211 2025-07-17T11:08:17.7513999Z -E 38 2025-07-17T11:08:17.7514179Z 2025-07-17T11:08:17.7514372Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-07-17T11:08:17.7514773Z Use direct/preprocessor mode? yes 2025-07-17T11:08:17.7515063Z Version (client) 0.10.0 2025-07-17T11:08:17.7515346Z Cache size 22 MiB 2025-07-17T11:08:17.7515637Z Max cache size 10 GiB 2025-07-17T11:08:17.7515927Z + echo ::endgroup:: 2025-07-17T11:08:17.7516389Z ##[endgroup] 2025-07-17T11:08:17.7656762Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-07-17T11:08:17.7657536Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-07-17T11:08:17.7658437Z docker exec -t "55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test" 2025-07-17T11:08:17.7705293Z shell: /usr/bin/bash -e {0} 2025-07-17T11:08:17.7705774Z env: 2025-07-17T11:08:17.7706172Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:17.7706891Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:17.7707980Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:17.7708962Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:17.7710688Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:17.7712217Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:17.7712715Z AWS_REGION: us-east-1 2025-07-17T11:08:17.7713288Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:17.7713951Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:17.7723691Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:17.7724412Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:17.7725173Z ##[endgroup] 2025-07-17T11:08:17.9901108Z ##[group]Run cat test/**/*_toprint.log || true 2025-07-17T11:08:17.9901890Z cat test/**/*_toprint.log || true 2025-07-17T11:08:17.9954924Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:17.9955594Z env: 2025-07-17T11:08:17.9955991Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:17.9956719Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:17.9958094Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:17.9959088Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:17.9960879Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:17.9962403Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:17.9962900Z AWS_REGION: us-east-1 2025-07-17T11:08:17.9963535Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:17.9964199Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:17.9974744Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:17.9975527Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:17.9976356Z ##[endgroup] 2025-07-17T11:08:18.0154380Z cat: 'test/**/*_toprint.log': No such file or directory 2025-07-17T11:08:18.0409959Z Prepare all required actions 2025-07-17T11:08:18.0411878Z Getting action download info 2025-07-17T11:08:18.4668749Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-07-17T11:08:19.0706558Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-07-17T11:08:19.7999126Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-07-17T11:08:19.7999464Z with: 2025-07-17T11:08:19.7999644Z use-gha: true 2025-07-17T11:08:19.7999892Z file-suffix: test-inductor-2-2-linux.rocm.gpu.2_46159470975 2025-07-17T11:08:19.8000191Z s3-bucket: gha-artifacts 2025-07-17T11:08:19.8000385Z env: 2025-07-17T11:08:19.8000543Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:19.8000847Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:19.8001275Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:19.8001697Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:19.8002399Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:19.8003031Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:19.8003242Z AWS_REGION: us-east-1 2025-07-17T11:08:19.8003479Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:19.8003908Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:19.8007851Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:19.8008156Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:19.8008485Z ##[endgroup] 2025-07-17T11:08:19.8088957Z ##[group]Run actions/upload-artifact@v4 2025-07-17T11:08:19.8089179Z with: 2025-07-17T11:08:19.8089466Z name: test-jsons-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip 2025-07-17T11:08:19.8089815Z retention-days: 14 2025-07-17T11:08:19.8090004Z if-no-files-found: warn 2025-07-17T11:08:19.8090194Z path: test/**/*.json 2025-07-17T11:08:19.8090385Z compression-level: 6 2025-07-17T11:08:19.8090566Z overwrite: false 2025-07-17T11:08:19.8090744Z include-hidden-files: false 2025-07-17T11:08:19.8090989Z env: 2025-07-17T11:08:19.8091163Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:19.8091514Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:19.8092032Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:19.8092535Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:19.8093382Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:19.8094016Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:19.8094216Z AWS_REGION: us-east-1 2025-07-17T11:08:19.8094439Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:19.8094717Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:19.8098806Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:19.8099107Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:19.8099441Z ##[endgroup] 2025-07-17T11:08:20.5038014Z With the provided path, there will be 7 files uploaded 2025-07-17T11:08:20.5043935Z Artifact name is valid! 2025-07-17T11:08:20.5044527Z Root directory input is valid! 2025-07-17T11:08:21.9938713Z Beginning upload of artifact content to blob storage 2025-07-17T11:08:22.5034257Z Uploaded bytes 42627 2025-07-17T11:08:22.5927724Z Finished uploading artifact content to blob storage! 2025-07-17T11:08:22.5933281Z SHA256 digest of uploaded artifact zip is 86a37dd7e2905057e508e193b588357de838a25fa236303222ed8282adcdd9cf 2025-07-17T11:08:22.5936132Z Finalizing artifact upload 2025-07-17T11:08:22.7346904Z Artifact test-jsons-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip.zip successfully finalized. Artifact ID 3553754145 2025-07-17T11:08:22.7349192Z Artifact test-jsons-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip has been successfully uploaded! Final size is 42627 bytes. Artifact ID is 3553754145 2025-07-17T11:08:22.7359818Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/16337959866/artifacts/3553754145 2025-07-17T11:08:22.7713375Z ##[group]Run actions/upload-artifact@v4 2025-07-17T11:08:22.7713962Z with: 2025-07-17T11:08:22.7714743Z name: test-reports-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip 2025-07-17T11:08:22.7715643Z retention-days: 14 2025-07-17T11:08:22.7716124Z if-no-files-found: ignore 2025-07-17T11:08:22.7716644Z path: test/**/*.xml test/**/*.csv 2025-07-17T11:08:22.7717184Z compression-level: 6 2025-07-17T11:08:22.7717638Z overwrite: false 2025-07-17T11:08:22.7718091Z include-hidden-files: false 2025-07-17T11:08:22.7718577Z env: 2025-07-17T11:08:22.7718968Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:22.7719889Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:22.7721015Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:22.7722000Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:22.7723670Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:22.7725550Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:22.7726059Z AWS_REGION: us-east-1 2025-07-17T11:08:22.7726633Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:22.7727293Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:22.7737924Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:22.7738839Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:22.7739661Z ##[endgroup] 2025-07-17T11:08:23.5118958Z With the provided path, there will be 66 files uploaded 2025-07-17T11:08:23.5124041Z Artifact name is valid! 2025-07-17T11:08:23.5124655Z Root directory input is valid! 2025-07-17T11:08:25.1047204Z Beginning upload of artifact content to blob storage 2025-07-17T11:08:26.0945659Z Uploaded bytes 501321 2025-07-17T11:08:26.1827190Z Finished uploading artifact content to blob storage! 2025-07-17T11:08:26.1832928Z SHA256 digest of uploaded artifact zip is 9fa97fded1dac8bb025186d84d7dfd04c3473939c6c63c70a57f1bd9ce54e3b4 2025-07-17T11:08:26.1835137Z Finalizing artifact upload 2025-07-17T11:08:26.3202295Z Artifact test-reports-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip.zip successfully finalized. Artifact ID 3553754533 2025-07-17T11:08:26.3204592Z Artifact test-reports-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip has been successfully uploaded! Final size is 501321 bytes. Artifact ID is 3553754533 2025-07-17T11:08:26.3215790Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/16337959866/artifacts/3553754533 2025-07-17T11:08:26.3563610Z ##[group]Run actions/upload-artifact@v4 2025-07-17T11:08:26.3564467Z with: 2025-07-17T11:08:26.3565155Z name: logs-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip 2025-07-17T11:08:26.3565993Z retention-days: 14 2025-07-17T11:08:26.3566482Z if-no-files-found: ignore 2025-07-17T11:08:26.3567013Z path: usage_log.txt test/**/*.log 2025-07-17T11:08:26.3567555Z compression-level: 6 2025-07-17T11:08:26.3568009Z overwrite: false 2025-07-17T11:08:26.3568489Z include-hidden-files: false 2025-07-17T11:08:26.3568974Z env: 2025-07-17T11:08:26.3569373Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:26.3570137Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:26.3571222Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:26.3572217Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:26.3574393Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:26.3576010Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:26.3576552Z AWS_REGION: us-east-1 2025-07-17T11:08:26.3577141Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:26.3577819Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:26.3587502Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:26.3588282Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:26.3589131Z ##[endgroup] 2025-07-17T11:08:27.1158283Z Multiple search paths detected. Calculating the least common ancestor of all paths 2025-07-17T11:08:27.1160667Z The least common ancestor is /home/pytorchci/actions-runner/_work/pytorch/pytorch. This will be the root directory of the artifact 2025-07-17T11:08:27.1161984Z With the provided path, there will be 64 files uploaded 2025-07-17T11:08:27.1164824Z Artifact name is valid! 2025-07-17T11:08:27.1165808Z Root directory input is valid! 2025-07-17T11:08:28.7128505Z Beginning upload of artifact content to blob storage 2025-07-17T11:08:30.1837300Z Uploaded bytes 1227092 2025-07-17T11:08:30.2750589Z Finished uploading artifact content to blob storage! 2025-07-17T11:08:30.2756261Z SHA256 digest of uploaded artifact zip is 61ff37cfbf00ace2d0b43015938d42dae9917c4360f2cc1b3aa5a6e1b83c467f 2025-07-17T11:08:30.2759123Z Finalizing artifact upload 2025-07-17T11:08:30.4115850Z Artifact logs-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip.zip successfully finalized. Artifact ID 3553754992 2025-07-17T11:08:30.4118559Z Artifact logs-runattempt1-test-inductor-2-2-linux.rocm.gpu.2_46159470975.zip has been successfully uploaded! Final size is 1227092 bytes. Artifact ID is 3553754992 2025-07-17T11:08:30.4129458Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/16337959866/artifacts/3553754992 2025-07-17T11:08:30.4495738Z ##[group]Run # shellcheck disable=SC2156 2025-07-17T11:08:30.4496580Z # shellcheck disable=SC2156 2025-07-17T11:08:30.4497621Z find . -iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-07-17T11:08:30.4548774Z shell: /usr/bin/bash -e {0} 2025-07-17T11:08:30.4549287Z env: 2025-07-17T11:08:30.4549721Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:30.4550493Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:30.4551604Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:30.4552664Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:30.4554408Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:30.4556021Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:30.4556558Z AWS_REGION: us-east-1 2025-07-17T11:08:30.4557155Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:30.4557840Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:30.4567978Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:30.4568761Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:30.4569596Z ##[endgroup] 2025-07-17T11:08:30.8463696Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-07-17T11:08:30.8464688Z with: 2025-07-17T11:08:30.8465550Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_upload-benchmark-results 2025-07-17T11:08:30.8466627Z role-duration-seconds: 18000 2025-07-17T11:08:30.8467289Z aws-region: us-east-1 2025-07-17T11:08:30.8467883Z audience: sts.amazonaws.com 2025-07-17T11:08:30.8468398Z env: 2025-07-17T11:08:30.8468815Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:30.8469574Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:30.8470647Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:30.8471676Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:30.8473465Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:30.8474998Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:30.8475545Z AWS_REGION: us-east-1 2025-07-17T11:08:30.8476155Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:30.8476856Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:30.8487140Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:30.8488081Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:30.8489103Z ##[endgroup] 2025-07-17T11:08:31.2238194Z Assuming role with OIDC 2025-07-17T11:08:31.5853594Z Authenticated as assumedRoleId AROAUPVRELQNA5GQHA6IA:GitHubActions 2025-07-17T11:08:31.7094837Z ##[group]Run pytorch/test-infra/.github/actions/upload-benchmark-results@main 2025-07-17T11:08:31.7095891Z with: 2025-07-17T11:08:31.7096510Z benchmark-results-dir: test/test-reports 2025-07-17T11:08:31.7097173Z dry-run: false 2025-07-17T11:08:31.7097643Z schema-version: v3 2025-07-17T11:08:31.7098610Z github-token: *** 2025-07-17T11:08:31.7099088Z env: 2025-07-17T11:08:31.7099534Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:31.7100310Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:31.7101782Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:31.7102873Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:31.7104647Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:31.7106317Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:31.7106874Z AWS_REGION: us-east-1 2025-07-17T11:08:31.7107450Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:31.7108160Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:31.7119680Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:31.7120489Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:31.7121360Z ##[endgroup] 2025-07-17T11:08:31.7158411Z ##[group]Run set -eux 2025-07-17T11:08:31.7158994Z set -eux 2025-07-17T11:08:31.7159896Z python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-07-17T11:08:31.7160719Z  2025-07-17T11:08:31.7161155Z DEVICE_NAME="" 2025-07-17T11:08:31.7161693Z DEVICE_TYPE="" 2025-07-17T11:08:31.7162174Z  2025-07-17T11:08:31.7162646Z if command -v nvidia-smi; then 2025-07-17T11:08:31.7163504Z  # NB: I'm using PyTorch here to get the device name, however, it needs to 2025-07-17T11:08:31.7164602Z  # install the correct version of PyTorch manually for now. Any PyTorch 2025-07-17T11:08:31.7165598Z  # version is fine, I just use 2.7.1 to satify PYPIDEP linter 2025-07-17T11:08:31.7166689Z  python3 -mpip install torch==2.7.1 2025-07-17T11:08:31.7167346Z elif command -v rocminfo; then 2025-07-17T11:08:31.7168128Z  # NB: Installing torch on ROCm runner with pip here causes CI to fail 2025-07-17T11:08:31.7169119Z  # with a memoryview is too large error only on MI300 runners. Is pip 2025-07-17T11:08:31.7170069Z  # version on ROCm runner there too old? As a workaround, let's use the 2025-07-17T11:08:31.7170918Z  # GPU device name coming from rocminfo instead 2025-07-17T11:08:31.7171561Z  DEVICE_NAME=rocm 2025-07-17T11:08:31.7172392Z  DEVICE_TYPE=$(rocminfo | grep "Marketing Name" | tail -n1 | awk -F':' '{print $2}' | xargs) 2025-07-17T11:08:31.7173248Z fi 2025-07-17T11:08:31.7173645Z  2025-07-17T11:08:31.7174144Z echo "DEVICE_NAME=$DEVICE_NAME" >> $GITHUB_ENV 2025-07-17T11:08:31.7174857Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-07-17T11:08:31.7227177Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:31.7227914Z env: 2025-07-17T11:08:31.7228348Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:31.7229155Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:31.7230339Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:31.7231405Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:31.7233215Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:31.7234832Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:31.7235398Z AWS_REGION: us-east-1 2025-07-17T11:08:31.7236420Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:31.7237173Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:31.7248618Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:31.7249441Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:31.7250338Z ##[endgroup] 2025-07-17T11:08:31.7326796Z + python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-07-17T11:08:31.9859145Z Defaulting to user installation because normal site-packages is not writeable 2025-07-17T11:08:32.0640762Z Requirement already satisfied: boto3==1.35.33 in /home/pytorchci/.local/lib/python3.10/site-packages (1.35.33) 2025-07-17T11:08:32.0644615Z Requirement already satisfied: psutil==7.0.0 in /home/pytorchci/.local/lib/python3.10/site-packages (7.0.0) 2025-07-17T11:08:32.0654092Z Requirement already satisfied: pynvml==12.0.0 in /home/pytorchci/.local/lib/python3.10/site-packages (12.0.0) 2025-07-17T11:08:32.0688041Z Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /home/pytorchci/.local/lib/python3.10/site-packages (from boto3==1.35.33) (0.10.4) 2025-07-17T11:08:32.0691888Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /home/pytorchci/.local/lib/python3.10/site-packages (from boto3==1.35.33) (1.0.1) 2025-07-17T11:08:32.0696773Z Requirement already satisfied: botocore<1.36.0,>=1.35.33 in /home/pytorchci/.local/lib/python3.10/site-packages (from boto3==1.35.33) (1.35.99) 2025-07-17T11:08:32.0848901Z Requirement already satisfied: nvidia-ml-py<13.0.0a0,>=12.0.0 in /home/pytorchci/.local/lib/python3.10/site-packages (from pynvml==12.0.0) (12.575.51) 2025-07-17T11:08:32.0891758Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /home/pytorchci/.local/lib/python3.10/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.9.0.post0) 2025-07-17T11:08:32.0902700Z Requirement already satisfied: urllib3!=2.2.0,<3,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.26.5) 2025-07-17T11:08:32.0946304Z Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.16.0) 2025-07-17T11:08:32.3506272Z + DEVICE_NAME= 2025-07-17T11:08:32.3506826Z + DEVICE_TYPE= 2025-07-17T11:08:32.3507326Z + command -v nvidia-smi 2025-07-17T11:08:32.3507868Z + command -v rocminfo 2025-07-17T11:08:32.3508371Z + DEVICE_NAME=rocm 2025-07-17T11:08:32.3508845Z /usr/bin/rocminfo 2025-07-17T11:08:32.3520562Z ++ rocminfo 2025-07-17T11:08:32.3523966Z ++ grep 'Marketing Name' 2025-07-17T11:08:32.3524587Z ++ tail -n1 2025-07-17T11:08:32.3528105Z ++ awk -F: '{print $2}' 2025-07-17T11:08:32.3530764Z ++ xargs 2025-07-17T11:08:32.4471327Z + DEVICE_TYPE='AMD Instinct MI210' 2025-07-17T11:08:32.4471702Z + echo DEVICE_NAME=rocm 2025-07-17T11:08:32.4472007Z + echo 'DEVICE_TYPE=AMD Instinct MI210' 2025-07-17T11:08:32.4508821Z ##[group]Run set -eux 2025-07-17T11:08:32.4509366Z set -eux 2025-07-17T11:08:32.4509818Z  2025-07-17T11:08:32.4510321Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-07-17T11:08:32.4511019Z  echo "Missing github-token input" 2025-07-17T11:08:32.4511643Z  exit 1 2025-07-17T11:08:32.4512078Z fi 2025-07-17T11:08:32.4559380Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:32.4560277Z env: 2025-07-17T11:08:32.4560719Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:32.4561508Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:32.4562631Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:32.4563691Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:32.4565422Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:32.4567144Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:32.4568279Z AWS_REGION: us-east-1 2025-07-17T11:08:32.4569047Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:32.4569899Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:32.4577333Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:32.4577760Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:32.4578217Z DEVICE_NAME: rocm 2025-07-17T11:08:32.4578518Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:32.4579334Z GITHUB_TOKEN: *** 2025-07-17T11:08:32.4579887Z ##[endgroup] 2025-07-17T11:08:32.4662639Z + [[ -z *** ]] 2025-07-17T11:08:32.4735410Z ##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-07-17T11:08:32.4735857Z with: 2025-07-17T11:08:32.4736215Z github-token: *** 2025-07-17T11:08:32.4736468Z env: 2025-07-17T11:08:32.4736694Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:32.4737113Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:32.4737721Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:32.4738260Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:32.4739861Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:32.4741500Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:32.4742058Z AWS_REGION: us-east-1 2025-07-17T11:08:32.4742639Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:32.4743363Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:32.4755198Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:32.4756012Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:32.4756910Z DEVICE_NAME: rocm 2025-07-17T11:08:32.4757427Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:32.4757981Z ##[endgroup] 2025-07-17T11:08:32.4781297Z ##[group]Run set -eux 2025-07-17T11:08:32.4781588Z set -eux 2025-07-17T11:08:32.4781829Z  2025-07-17T11:08:32.4782294Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-07-17T11:08:32.4830085Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:32.4830828Z env: 2025-07-17T11:08:32.4831271Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:32.4832079Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:32.4833241Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:32.4834385Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:32.4836146Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:32.4837740Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:32.4838314Z AWS_REGION: us-east-1 2025-07-17T11:08:32.4838931Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:32.4839762Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:32.4851979Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:32.4852783Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:32.4853314Z DEVICE_NAME: rocm 2025-07-17T11:08:32.4853570Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:32.4853979Z GITHUB_TOKEN: *** 2025-07-17T11:08:32.4854225Z ##[endgroup] 2025-07-17T11:08:32.4964608Z + python3 /home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 16337959866 pytorch-rocm-hw-43 2025-07-17T11:08:32.8881475Z setting job-id=46159470975 2025-07-17T11:08:32.8882439Z setting job-name=rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T11:08:32.9060986Z ##[group]Run set -eux 2025-07-17T11:08:32.9061548Z set -eux 2025-07-17T11:08:32.9062041Z  2025-07-17T11:08:32.9063195Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-07-17T11:08:32.9064169Z  --schema-version "${SCHEMA_VERSION}" \ 2025-07-17T11:08:32.9064877Z  --repo "${REPO}" \ 2025-07-17T11:08:32.9065488Z  --head-branch "${HEAD_BRANCH}" \ 2025-07-17T11:08:32.9066138Z  --head-sha "${HEAD_SHA}" \ 2025-07-17T11:08:32.9066801Z  --workflow-id "${WORKFLOW_RUN_ID}" \ 2025-07-17T11:08:32.9067481Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-07-17T11:08:32.9068119Z  --job-id "${JOB_ID}" \ 2025-07-17T11:08:32.9068954Z  --job-name "${JOB_NAME}" 2025-07-17T11:08:32.9113089Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:32.9113845Z env: 2025-07-17T11:08:32.9114288Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:32.9115113Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:32.9116292Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:32.9117361Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:32.9119127Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:32.9120865Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:32.9121428Z AWS_REGION: us-east-1 2025-07-17T11:08:32.9122061Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:32.9122791Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:32.9133279Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:32.9133709Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:32.9134171Z DEVICE_NAME: rocm 2025-07-17T11:08:32.9134445Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:32.9134740Z SCHEMA_VERSION: v3 2025-07-17T11:08:32.9135016Z REPO: pytorch/pytorch 2025-07-17T11:08:32.9135304Z HEAD_BRANCH: refs/heads/main 2025-07-17T11:08:32.9135646Z HEAD_SHA: a38f433be2e94a64b095a44ba39879d02d0c2316 2025-07-17T11:08:32.9136015Z WORKFLOW_RUN_ID: 16337959866 2025-07-17T11:08:32.9136442Z RUN_ATTEMPT: 1 2025-07-17T11:08:32.9136694Z JOB_ID: 46159470975 2025-07-17T11:08:32.9137111Z JOB_NAME: rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) 2025-07-17T11:08:32.9137684Z ##[endgroup] 2025-07-17T11:08:32.9216888Z + python3 /home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha a38f433be2e94a64b095a44ba39879d02d0c2316 --workflow-id 16337959866 --run-attempt 1 --job-id 46159470975 --job-name 'rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2)' 2025-07-17T11:08:32.9511064Z ##[group]Run set -eux 2025-07-17T11:08:32.9511585Z set -eux 2025-07-17T11:08:32.9512035Z  2025-07-17T11:08:32.9512810Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-07-17T11:08:32.9555861Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:32.9556616Z env: 2025-07-17T11:08:32.9557054Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:32.9557841Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:32.9559004Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:32.9560214Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:32.9562012Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:32.9563625Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:32.9564177Z AWS_REGION: us-east-1 2025-07-17T11:08:32.9564791Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:32.9565559Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:32.9575184Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:32.9575620Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:32.9576076Z DEVICE_NAME: rocm 2025-07-17T11:08:32.9576332Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:32.9576613Z ##[endgroup] 2025-07-17T11:08:32.9652047Z + python3 /home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-07-17T11:08:33.6910635Z /home/pytorchci/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) 2025-07-17T11:08:33.6913505Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-07-17T11:08:34.2582548Z ##[group]Run set -eux 2025-07-17T11:08:34.2583112Z set -eux 2025-07-17T11:08:34.2583555Z  2025-07-17T11:08:34.2584094Z # TODO (huydhn): Implement this part 2025-07-17T11:08:34.2584844Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-07-17T11:08:34.2637320Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:34.2638075Z env: 2025-07-17T11:08:34.2638521Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:34.2639302Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:34.2640549Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:34.2641600Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:34.2643342Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:34.2644923Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:34.2645489Z AWS_REGION: us-east-1 2025-07-17T11:08:34.2646219Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:34.2647068Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:34.2659541Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:34.2660877Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:34.2661777Z DEVICE_NAME: rocm 2025-07-17T11:08:34.2662290Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:34.2662850Z ##[endgroup] 2025-07-17T11:08:34.2738106Z + echo 'dependencies={}' 2025-07-17T11:08:34.2777958Z ##[group]Run set -eux 2025-07-17T11:08:34.2778632Z set -eux 2025-07-17T11:08:34.2779177Z  2025-07-17T11:08:34.2779809Z if [[ ! -d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-07-17T11:08:34.2780791Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-07-17T11:08:34.2781722Z  # We don't want the job to fail if the directory doesn't exist 2025-07-17T11:08:34.2782455Z  exit 0 2025-07-17T11:08:34.2782903Z fi 2025-07-17T11:08:34.2783325Z  2025-07-17T11:08:34.2783791Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-07-17T11:08:34.2784683Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-07-17T11:08:34.2785717Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-07-17T11:08:34.2786510Z  --metadata "${BENCHMARK_METADATA}" \ 2025-07-17T11:08:34.2787168Z  --runners "${RUNNER_INFO}" \ 2025-07-17T11:08:34.2787820Z  --dependencies "${DEPENDENCIES}" \ 2025-07-17T11:08:34.2788430Z  --dry-run 2025-07-17T11:08:34.2788919Z else 2025-07-17T11:08:34.2789616Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-07-17T11:08:34.2790608Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-07-17T11:08:34.2791388Z  --metadata "${BENCHMARK_METADATA}" \ 2025-07-17T11:08:34.2792023Z  --runners "${RUNNER_INFO}" \ 2025-07-17T11:08:34.2793070Z  --dependencies "${DEPENDENCIES}" 2025-07-17T11:08:34.2793685Z fi 2025-07-17T11:08:34.2838899Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:34.2839774Z env: 2025-07-17T11:08:34.2840209Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:34.2841022Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:34.2842146Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:34.2843186Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:34.2845227Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:34.2847034Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:34.2847678Z AWS_REGION: us-east-1 2025-07-17T11:08:34.2848386Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:34.2849243Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:34.2861620Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:34.2862432Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:34.2863312Z DEVICE_NAME: rocm 2025-07-17T11:08:34.2863806Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:34.2864409Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-07-17T11:08:34.2865011Z DRY_RUN: false 2025-07-17T11:08:34.2867193Z BENCHMARK_METADATA: {"timestamp": 1752750512, "schema_version": "v3", "name": "rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "a38f433be2e94a64b095a44ba39879d02d0c2316", "workflow_id": 16337959866, "run_attempt": 1, "job_id": 46159470975} 2025-07-17T11:08:34.2870126Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 64, "avail_mem_in_gb": 125, "extra_info": {"hostname": "pytorch-rocm-hw-43"}, "name": "rocm", "type": "AMD Instinct MI210"}] 2025-07-17T11:08:34.2871357Z DEPENDENCIES: {} 2025-07-17T11:08:34.2871831Z ##[endgroup] 2025-07-17T11:08:34.2945489Z + [[ ! -d test/test-reports ]] 2025-07-17T11:08:34.2946172Z + [[ false == \t\r\u\e ]] 2025-07-17T11:08:34.2951462Z + python3 /home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1752750512, "schema_version": "v3", "name": "rocm-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "a38f433be2e94a64b095a44ba39879d02d0c2316", "workflow_id": 16337959866, "run_attempt": 1, "job_id": 46159470975}' --runners '[{"cpu_info": "x86_64", "cpu_count": 64, "avail_mem_in_gb": 125, "extra_info": {"hostname": "pytorch-rocm-hw-43"}, "name": "rocm", "type": "AMD Instinct MI210"}]' --dependencies '{}' 2025-07-17T11:08:34.4280260Z /home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'test_modules'}, {'test_file': 'test_ops'}, {'test_file': 'test_ops_gradients'}, {'test_file': 'test_torch'}], 'excluded': []} from test/test-reports/td_exclusions-76ccf76316b89b43db7f.json is not a benchmark record, skipping 2025-07-17T11:08:34.4283603Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-07-17T11:08:34.4287077Z /home/pytorchci/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'inductor/test_aot_inductor'}, {'test_file': 'inductor/test_torchinductor'}, {'test_file': 'inductor/test_torchinductor_opinfo'}], 'excluded': []} from test/test-reports/td_exclusions-55c316f6806c2fbf0048.json is not a benchmark record, skipping 2025-07-17T11:08:34.4290552Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-07-17T11:08:34.4519936Z Prepare all required actions 2025-07-17T11:08:34.4520782Z Getting action download info 2025-07-17T11:08:34.4575716Z ##[group]Run ./.github/actions/teardown-rocm 2025-07-17T11:08:34.4576377Z env: 2025-07-17T11:08:34.4576825Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:34.4577623Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:34.4578961Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:34.4580216Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:34.4582354Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:34.4583947Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:34.4584502Z AWS_REGION: us-east-1 2025-07-17T11:08:34.4585110Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:34.4585891Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:34.4597168Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:34.4598000Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:34.4598911Z DEVICE_NAME: rocm 2025-07-17T11:08:34.4599570Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:34.4600148Z ##[endgroup] 2025-07-17T11:08:34.4625168Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-07-17T11:08:34.4625713Z # ignore expansion of "docker ps -q" since it could be empty 2025-07-17T11:08:34.4626195Z # shellcheck disable=SC2046 2025-07-17T11:08:34.4626854Z docker stop $(docker ps -q) || true 2025-07-17T11:08:34.4627652Z # Prune all stopped containers. 2025-07-17T11:08:34.4628403Z docker container prune -f 2025-07-17T11:08:34.4674606Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:08:34.4675355Z env: 2025-07-17T11:08:34.4675795Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:08:34.4676608Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:08:34.4677750Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:08:34.4679062Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:08:34.4680951Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:08:34.4682547Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:08:34.4683112Z AWS_REGION: us-east-1 2025-07-17T11:08:34.4683713Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:08:34.4684422Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:08:34.4696772Z AWS_SESSION_TOKEN: *** 2025-07-17T11:08:34.4697605Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:08:34.4698648Z DEVICE_NAME: rocm 2025-07-17T11:08:34.4699247Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:08:34.4699899Z ##[endgroup] 2025-07-17T11:08:45.7736902Z 55cd6191418f 2025-07-17T11:09:26.6963402Z Deleted Containers: 2025-07-17T11:09:26.6964448Z 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:09:26.6965203Z 2025-07-17T11:09:26.6965774Z Total reclaimed space: 16.04GB 2025-07-17T11:09:26.7026451Z Prepare all required actions 2025-07-17T11:09:26.7063505Z ##[group]Run ./.github/actions/diskspace-cleanup 2025-07-17T11:09:26.7063866Z with: 2025-07-17T11:09:26.7064115Z diskspace-cutoff: 70 2025-07-17T11:09:26.7064400Z env: 2025-07-17T11:09:26.7064677Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:09:26.7065185Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:09:26.7065900Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:09:26.7066535Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:09:26.7067723Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:09:26.7068589Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:09:26.7068887Z AWS_REGION: us-east-1 2025-07-17T11:09:26.7069226Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:09:26.7069611Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:09:26.7075425Z AWS_SESSION_TOKEN: *** 2025-07-17T11:09:26.7075857Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:09:26.7076507Z DEVICE_NAME: rocm 2025-07-17T11:09:26.7076790Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:09:26.7077080Z ##[endgroup] 2025-07-17T11:09:26.7093380Z ##[group]Run set -ex 2025-07-17T11:09:26.7093686Z set -ex 2025-07-17T11:09:26.7093959Z diskspace_cutoff=70 2025-07-17T11:09:26.7094349Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-07-17T11:09:26.7094777Z if [ ! -d "$docker_root_dir" ]; then 2025-07-17T11:09:26.7095310Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-07-17T11:09:26.7095809Z  exit 0 2025-07-17T11:09:26.7096063Z fi 2025-07-17T11:09:26.7096489Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-07-17T11:09:26.7097339Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-07-17T11:09:26.7098060Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-07-17T11:09:26.7098475Z  docker system prune -af 2025-07-17T11:09:26.7098997Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-07-17T11:09:26.7099558Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-07-17T11:09:26.7100158Z  echo "Error: Available diskspace is less than $diskspace_cutoff percent. Not enough diskspace." 2025-07-17T11:09:26.7100700Z  echo "$msg" 2025-07-17T11:09:26.7101173Z  exit 1 2025-07-17T11:09:26.7101450Z  else 2025-07-17T11:09:26.7101752Z  difference=$((diskspace - diskspace_new)) 2025-07-17T11:09:26.7102205Z  echo "Diskspace saved: $difference percent" 2025-07-17T11:09:26.7102616Z  fi 2025-07-17T11:09:26.7102898Z fi 2025-07-17T11:09:26.7131528Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-07-17T11:09:26.7131912Z env: 2025-07-17T11:09:26.7132151Z GIT_DEFAULT_BRANCH: main 2025-07-17T11:09:26.7132570Z RUNNER_ARTIFACT_DIR: /home/pytorchci/actions-runner/_work/_temp/artifacts 2025-07-17T11:09:26.7133167Z RUNNER_TEST_RESULTS_DIR: /home/pytorchci/actions-runner/_work/_temp/test-results 2025-07-17T11:09:26.7133744Z RUNNER_DOCS_DIR: /home/pytorchci/actions-runner/_work/_temp/docs 2025-07-17T11:09:26.7134658Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --device /dev/dri --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-07-17T11:09:26.7135469Z AWS_DEFAULT_REGION: us-east-1 2025-07-17T11:09:26.7135773Z AWS_REGION: us-east-1 2025-07-17T11:09:26.7136104Z AWS_ACCESS_KEY_ID: *** 2025-07-17T11:09:26.7136474Z AWS_SECRET_ACCESS_KEY: *** 2025-07-17T11:09:26.7142269Z AWS_SESSION_TOKEN: *** 2025-07-17T11:09:26.7142782Z CONTAINER_NAME: 55cd6191418f28f7cd12146cccb866b77db2d1432a6a718c7cf4a8a10b88775c 2025-07-17T11:09:26.7143338Z DEVICE_NAME: rocm 2025-07-17T11:09:26.7143644Z DEVICE_TYPE: AMD Instinct MI210 2025-07-17T11:09:26.7143993Z ##[endgroup] 2025-07-17T11:09:26.7187463Z + diskspace_cutoff=70 2025-07-17T11:09:26.7194786Z ++ docker info -f '{{.DockerRootDir}}' 2025-07-17T11:09:26.7727601Z + docker_root_dir=/home/pytorchci/.local/share/docker 2025-07-17T11:09:26.7728844Z + '[' '!' -d /home/pytorchci/.local/share/docker ']' 2025-07-17T11:09:26.7735247Z ++ df -H --output=pcent /home/pytorchci/.local/share/docker 2025-07-17T11:09:26.7736861Z ++ sed -n 2p 2025-07-17T11:09:26.7737972Z ++ sed s/%// 2025-07-17T11:09:26.7739371Z ++ sed 's/ //' 2025-07-17T11:09:26.7756465Z + diskspace=41 2025-07-17T11:09:26.7757237Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified' 2025-07-17T11:09:26.7757822Z + [[ 41 -ge 70 ]] 2025-07-17T11:09:26.7803026Z Post job cleanup. 2025-07-17T11:09:26.7843266Z Post job cleanup. 2025-07-17T11:09:26.9008185Z Post job cleanup. 2025-07-17T11:09:26.9337362Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-07-17T11:09:26.9676812Z Post job cleanup. 2025-07-17T11:09:27.0928587Z Post job cleanup. 2025-07-17T11:09:27.1015311Z Post job cleanup. 2025-07-17T11:09:27.1907028Z [command]/usr/bin/git version 2025-07-17T11:09:27.1952133Z git version 2.34.1 2025-07-17T11:09:27.1984474Z Copying '/home/pytorchci/.gitconfig' to '/home/pytorchci/actions-runner/_work/_temp/daf8bae7-4437-4376-b9d8-bceb14403785/.gitconfig' 2025-07-17T11:09:27.1994237Z Temporarily overriding HOME='/home/pytorchci/actions-runner/_work/_temp/daf8bae7-4437-4376-b9d8-bceb14403785' before making global git config changes 2025-07-17T11:09:27.1995186Z Adding repository directory to the temporary git global config as a safe directory 2025-07-17T11:09:27.2006068Z [command]/usr/bin/git config --global --add safe.directory /home/pytorchci/actions-runner/_work/pytorch/pytorch 2025-07-17T11:09:27.2052818Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-07-17T11:09:27.2098807Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-07-17T11:09:27.2451319Z Entering 'android/libs/fbjni' 2025-07-17T11:09:27.2509564Z Entering 'third_party/FP16' 2025-07-17T11:09:27.2567601Z Entering 'third_party/FXdiv' 2025-07-17T11:09:27.2607541Z Entering 'third_party/NNPACK' 2025-07-17T11:09:27.2647471Z Entering 'third_party/NVTX' 2025-07-17T11:09:27.2697400Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T11:09:27.2747658Z Entering 'third_party/XNNPACK' 2025-07-17T11:09:27.2814468Z Entering 'third_party/aiter' 2025-07-17T11:09:27.2867462Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T11:09:27.2934903Z Entering 'third_party/benchmark' 2025-07-17T11:09:27.2987356Z Entering 'third_party/composable_kernel' 2025-07-17T11:09:27.3039919Z Entering 'third_party/cpp-httplib' 2025-07-17T11:09:27.3094687Z Entering 'third_party/cpuinfo' 2025-07-17T11:09:27.3148397Z Entering 'third_party/cudnn_frontend' 2025-07-17T11:09:27.3199576Z Entering 'third_party/cutlass' 2025-07-17T11:09:27.3252557Z Entering 'third_party/fbgemm' 2025-07-17T11:09:27.3295852Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T11:09:27.3342266Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T11:09:27.3392199Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T11:09:27.3444118Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T11:09:27.3499473Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T11:09:27.3543595Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T11:09:27.3585297Z Entering 'third_party/fbgemm/external/json' 2025-07-17T11:09:27.3630796Z Entering 'third_party/flash-attention' 2025-07-17T11:09:27.3681808Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T11:09:27.3736707Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T11:09:27.3792182Z Entering 'third_party/flatbuffers' 2025-07-17T11:09:27.3843403Z Entering 'third_party/fmt' 2025-07-17T11:09:27.3896448Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T11:09:27.3950539Z Entering 'third_party/gloo' 2025-07-17T11:09:27.3995061Z Entering 'third_party/googletest' 2025-07-17T11:09:27.4044450Z Entering 'third_party/ideep' 2025-07-17T11:09:27.4089533Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T11:09:27.4144137Z Entering 'third_party/ittapi' 2025-07-17T11:09:27.4202806Z Entering 'third_party/kineto' 2025-07-17T11:09:27.4256024Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T11:09:27.4298033Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T11:09:27.4350844Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T11:09:27.4396296Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T11:09:27.4435371Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T11:09:27.4471837Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T11:09:27.4523489Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T11:09:27.4562155Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T11:09:27.4605454Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T11:09:27.4652047Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T11:09:27.4693436Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T11:09:27.4740418Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T11:09:27.4791331Z Entering 'third_party/kleidiai' 2025-07-17T11:09:27.4843908Z Entering 'third_party/mimalloc' 2025-07-17T11:09:27.4889989Z Entering 'third_party/nlohmann' 2025-07-17T11:09:27.4935623Z Entering 'third_party/onnx' 2025-07-17T11:09:27.4992916Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T11:09:27.5046800Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T11:09:27.5092644Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T11:09:27.5139621Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T11:09:27.5181677Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T11:09:27.5222383Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T11:09:27.5267428Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T11:09:27.5307755Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T11:09:27.5349770Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T11:09:27.5387271Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T11:09:27.5432719Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T11:09:27.5481513Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T11:09:27.5543798Z Entering 'third_party/pocketfft' 2025-07-17T11:09:27.5591791Z Entering 'third_party/protobuf' 2025-07-17T11:09:27.5636140Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T11:09:27.5681505Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T11:09:27.5727159Z Entering 'third_party/psimd' 2025-07-17T11:09:27.5773095Z Entering 'third_party/pthreadpool' 2025-07-17T11:09:27.5813913Z Entering 'third_party/pybind11' 2025-07-17T11:09:27.5855042Z Entering 'third_party/python-peachpy' 2025-07-17T11:09:27.5898504Z Entering 'third_party/sleef' 2025-07-17T11:09:27.5943746Z Entering 'third_party/tensorpipe' 2025-07-17T11:09:27.5984901Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T11:09:27.6025066Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T11:09:27.6064210Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T11:09:27.6104012Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T11:09:27.6143010Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T11:09:27.6220409Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-07-17T11:09:27.6250360Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6260866Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-07-17T11:09:27.6301475Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-07-17T11:09:27.6615456Z Entering 'android/libs/fbjni' 2025-07-17T11:09:27.6647525Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6677109Z Entering 'third_party/FP16' 2025-07-17T11:09:27.6700582Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6748638Z Entering 'third_party/FXdiv' 2025-07-17T11:09:27.6776914Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6817328Z Entering 'third_party/NNPACK' 2025-07-17T11:09:27.6847740Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6886933Z Entering 'third_party/NVTX' 2025-07-17T11:09:27.6905479Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6935680Z Entering 'third_party/VulkanMemoryAllocator' 2025-07-17T11:09:27.6967510Z http.https://github.com/.extraheader 2025-07-17T11:09:27.6994774Z Entering 'third_party/XNNPACK' 2025-07-17T11:09:27.7018975Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7070394Z Entering 'third_party/aiter' 2025-07-17T11:09:27.7099404Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7138644Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-07-17T11:09:27.7169049Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7211143Z Entering 'third_party/benchmark' 2025-07-17T11:09:27.7227777Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7260583Z Entering 'third_party/composable_kernel' 2025-07-17T11:09:27.7290353Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7326986Z Entering 'third_party/cpp-httplib' 2025-07-17T11:09:27.7355041Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7387569Z Entering 'third_party/cpuinfo' 2025-07-17T11:09:27.7407776Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7447149Z Entering 'third_party/cudnn_frontend' 2025-07-17T11:09:27.7475393Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7507482Z Entering 'third_party/cutlass' 2025-07-17T11:09:27.7528218Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7571646Z Entering 'third_party/fbgemm' 2025-07-17T11:09:27.7593168Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7636884Z Entering 'third_party/fbgemm/external/asmjit' 2025-07-17T11:09:27.7666279Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7704345Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-07-17T11:09:27.7736938Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7780739Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-07-17T11:09:27.7809622Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7845537Z Entering 'third_party/fbgemm/external/cutlass' 2025-07-17T11:09:27.7875993Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7928597Z Entering 'third_party/fbgemm/external/googletest' 2025-07-17T11:09:27.7947429Z http.https://github.com/.extraheader 2025-07-17T11:09:27.7992949Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-07-17T11:09:27.8015866Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8051125Z Entering 'third_party/fbgemm/external/json' 2025-07-17T11:09:27.8075149Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8116725Z Entering 'third_party/flash-attention' 2025-07-17T11:09:27.8143370Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8179351Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-07-17T11:09:27.8206014Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8250653Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-07-17T11:09:27.8275055Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8322143Z Entering 'third_party/flatbuffers' 2025-07-17T11:09:27.8346534Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8380883Z Entering 'third_party/fmt' 2025-07-17T11:09:27.8415470Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8449637Z Entering 'third_party/gemmlowp/gemmlowp' 2025-07-17T11:09:27.8473291Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8510154Z Entering 'third_party/gloo' 2025-07-17T11:09:27.8539323Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8581958Z Entering 'third_party/googletest' 2025-07-17T11:09:27.8616690Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8647745Z Entering 'third_party/ideep' 2025-07-17T11:09:27.8674909Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8706458Z Entering 'third_party/ideep/mkl-dnn' 2025-07-17T11:09:27.8732513Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8776329Z Entering 'third_party/ittapi' 2025-07-17T11:09:27.8804735Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8836251Z Entering 'third_party/kineto' 2025-07-17T11:09:27.8863381Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8901087Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-07-17T11:09:27.8929977Z http.https://github.com/.extraheader 2025-07-17T11:09:27.8964893Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-07-17T11:09:27.8986610Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9019773Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-07-17T11:09:27.9044516Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9092116Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-07-17T11:09:27.9116064Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9149484Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-07-17T11:09:27.9172270Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9208748Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-07-17T11:09:27.9229016Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9280215Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-07-17T11:09:27.9304067Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9342945Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-07-17T11:09:27.9367466Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9407933Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-07-17T11:09:27.9430711Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9468490Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-07-17T11:09:27.9492173Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9534499Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-07-17T11:09:27.9557185Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9590522Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-07-17T11:09:27.9612474Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9656473Z Entering 'third_party/kleidiai' 2025-07-17T11:09:27.9684759Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9716088Z Entering 'third_party/mimalloc' 2025-07-17T11:09:27.9742853Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9786961Z Entering 'third_party/nlohmann' 2025-07-17T11:09:27.9809550Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9846778Z Entering 'third_party/onnx' 2025-07-17T11:09:27.9868973Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9921072Z Entering 'third_party/onnx/third_party/pybind11' 2025-07-17T11:09:27.9945962Z http.https://github.com/.extraheader 2025-07-17T11:09:27.9988813Z Entering 'third_party/opentelemetry-cpp' 2025-07-17T11:09:28.0017498Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0057442Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-07-17T11:09:28.0079279Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0121639Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-07-17T11:09:28.0146573Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0176263Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-07-17T11:09:28.0198544Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0228925Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-07-17T11:09:28.0257818Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0290554Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-07-17T11:09:28.0311658Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0344158Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-07-17T11:09:28.0373905Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0409596Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-07-17T11:09:28.0431312Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0468004Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-07-17T11:09:28.0500457Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0538739Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-07-17T11:09:28.0566560Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0598543Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-07-17T11:09:28.0621435Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0678060Z Entering 'third_party/pocketfft' 2025-07-17T11:09:28.0710753Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0749784Z Entering 'third_party/protobuf' 2025-07-17T11:09:28.0776431Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0814814Z Entering 'third_party/protobuf/third_party/benchmark' 2025-07-17T11:09:28.0839734Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0870459Z Entering 'third_party/protobuf/third_party/googletest' 2025-07-17T11:09:28.0891977Z http.https://github.com/.extraheader 2025-07-17T11:09:28.0934069Z Entering 'third_party/psimd' 2025-07-17T11:09:28.0956739Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1008279Z Entering 'third_party/pthreadpool' 2025-07-17T11:09:28.1027169Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1065244Z Entering 'third_party/pybind11' 2025-07-17T11:09:28.1086311Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1123607Z Entering 'third_party/python-peachpy' 2025-07-17T11:09:28.1145069Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1180968Z Entering 'third_party/sleef' 2025-07-17T11:09:28.1206662Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1240282Z Entering 'third_party/tensorpipe' 2025-07-17T11:09:28.1264558Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1295808Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-07-17T11:09:28.1316789Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1352479Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-07-17T11:09:28.1372329Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1418677Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-07-17T11:09:28.1439991Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1493003Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-07-17T11:09:28.1515539Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1545761Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-07-17T11:09:28.1570660Z http.https://github.com/.extraheader 2025-07-17T11:09:28.1827848Z Cleaning up orphan processes