2024-08-06T20:44:36.4206316Z Current runner version: '2.318.0' 2024-08-06T20:44:36.4213609Z Runner name: 'i-0756c52faebfec338' 2024-08-06T20:44:36.4214615Z Runner group name: 'Default' 2024-08-06T20:44:36.4215817Z Machine name: 'ip-10-0-40-3' 2024-08-06T20:44:36.4220983Z ##[group]GITHUB_TOKEN Permissions 2024-08-06T20:44:36.4223628Z Actions: read 2024-08-06T20:44:36.4224351Z Attestations: read 2024-08-06T20:44:36.4225013Z Checks: read 2024-08-06T20:44:36.4225853Z Contents: read 2024-08-06T20:44:36.4226535Z Deployments: read 2024-08-06T20:44:36.4227191Z Discussions: read 2024-08-06T20:44:36.4227959Z Issues: read 2024-08-06T20:44:36.4228628Z Metadata: read 2024-08-06T20:44:36.4229295Z Packages: read 2024-08-06T20:44:36.4230025Z Pages: read 2024-08-06T20:44:36.4230690Z PullRequests: read 2024-08-06T20:44:36.4231389Z RepositoryProjects: read 2024-08-06T20:44:36.4232213Z SecurityEvents: read 2024-08-06T20:44:36.4232921Z Statuses: read 2024-08-06T20:44:36.4233570Z ##[endgroup] 2024-08-06T20:44:36.4236787Z Secret source: Actions 2024-08-06T20:44:36.4237712Z Prepare workflow directory 2024-08-06T20:44:36.8940795Z Prepare all required actions 2024-08-06T20:44:36.8985451Z Getting action download info 2024-08-06T20:44:37.1319547Z Download action repository 'pytorch/test-infra@main' (SHA:a1f5a89251fc4258ab59806094fe3108f7d6741a) 2024-08-06T20:44:38.6405528Z Download action repository 'pytorch/pytorch@main' (SHA:de00c7958301ce81b9716bdef5731ed40d4d14ca) 2024-08-06T20:44:51.6370692Z Download action repository 'aws-actions/configure-aws-credentials@v3' (SHA:50ac8dd1e1b10d09dac7b8727528b91bed831ac0) 2024-08-06T20:44:51.8296641Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2024-08-06T20:44:52.1052179Z Getting action download info 2024-08-06T20:44:52.1972931Z Download action repository 'malfet/checkout@silent-checkout' (SHA:e07af140b3ccefc05679e3755b9db68f4ee4589c) 2024-08-06T20:44:52.4723655Z Getting action download info 2024-08-06T20:44:52.5582088Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2024-08-06T20:44:52.8075216Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/pull/132710/merge (bf5bb5a1585a03379137fab341e87c02c77e76cd) 2024-08-06T20:44:52.8077037Z ##[group] Inputs 2024-08-06T20:44:52.8077376Z build-environment: linux-focal-cuda12.1-py3.10-gcc9-sm86 2024-08-06T20:44:52.8079147Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]} 2024-08-06T20:44:52.8081182Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:44:52.8081912Z sync-tag: 2024-08-06T20:44:52.8082604Z timeout-minutes: 240 2024-08-06T20:44:52.8082852Z use-gha: 2024-08-06T20:44:52.8083054Z dashboard-tag: 2024-08-06T20:44:52.8083290Z s3-bucket: gha-artifacts 2024-08-06T20:44:52.8083544Z aws-role-to-assume: 2024-08-06T20:44:52.8083783Z ##[endgroup] 2024-08-06T20:44:52.8084267Z Complete job name: linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:44:52.8567596Z A job started hook has been configured by the self-hosted runner administrator 2024-08-06T20:44:52.8670931Z ##[group]Run '/home/ec2-user/runner-scripts/before_job.sh' 2024-08-06T20:44:52.8681833Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:44:52.8682398Z ##[endgroup] 2024-08-06T20:44:54.7239373Z Runner Type: amz2023.linux.g5.4xlarge.nvidia.gpu 2024-08-06T20:44:54.7239893Z Instance Type: g5.4xlarge 2024-08-06T20:44:54.7240774Z AMI Name: al2023-ami-2023.5.20240701.0-kernel-6.1-x86_64 2024-08-06T20:44:54.7241340Z AMI ID: ami-06c68f701d8090592 2024-08-06T20:45:00.5765288Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2024-08-06T20:45:00.5765747Z with: 2024-08-06T20:45:00.5766467Z github-secret: *** 2024-08-06T20:45:00.5767289Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash 2024-08-06T20:45:00.5768067Z activate-with-label: false 2024-08-06T20:45:00.5768398Z label: with-ssh 2024-08-06T20:45:00.5768784Z remove-existing-keys: true 2024-08-06T20:45:00.5769138Z fail-silently: true 2024-08-06T20:45:00.5769448Z env: 2024-08-06T20:45:00.5769788Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:45:00.5770149Z ##[endgroup] 2024-08-06T20:45:00.6677447Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info. 2024-08-06T20:45:00.9650200Z Grabbing public ssh keys from https://github.com/drisspg.keys 2024-08-06T20:45:01.0299211Z ~/.ssh/authorized_keys file found on node, removing ~/.ssh and starting fresh 2024-08-06T20:45:01.0312638Z Public keys pulled and installed to /home/ec2-user/.ssh/authorized_keys 2024-08-06T20:45:01.0337756Z Login using: ssh ec2-user@ec2-3-82-109-234.compute-1.amazonaws.com 2024-08-06T20:45:01.0338497Z All testing is done inside the container, to start an interactive session run: 2024-08-06T20:45:01.0339121Z docker exec -it $(docker container ps --format '{{.ID}}') bash 2024-08-06T20:45:01.0467121Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2024-08-06T20:45:01.0467519Z with: 2024-08-06T20:45:01.0467715Z submodules: recursive 2024-08-06T20:45:01.0467948Z fetch-depth: 0 2024-08-06T20:45:01.0468155Z env: 2024-08-06T20:45:01.0468340Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:45:01.0468569Z ##[endgroup] 2024-08-06T20:45:01.0548125Z ##[group]Run retry () { 2024-08-06T20:45:01.0548402Z retry () { 2024-08-06T20:45:01.0548727Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*) 2024-08-06T20:45:01.0549079Z } 2024-08-06T20:45:01.0549305Z echo "${GITHUB_WORKSPACE}" 2024-08-06T20:45:01.0549612Z if [ -z "${NO_SUDO}" ]; then 2024-08-06T20:45:01.0549932Z  retry sudo rm -rf "${GITHUB_WORKSPACE}" 2024-08-06T20:45:01.0550234Z else 2024-08-06T20:45:01.0550461Z  retry rm -rf "${GITHUB_WORKSPACE}" 2024-08-06T20:45:01.0550745Z fi 2024-08-06T20:45:01.0550987Z mkdir "${GITHUB_WORKSPACE}" 2024-08-06T20:45:01.0560950Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:45:01.0561294Z env: 2024-08-06T20:45:01.0561492Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:45:01.0561724Z NO_SUDO: 2024-08-06T20:45:01.0561915Z ##[endgroup] 2024-08-06T20:45:01.0592427Z /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-06T20:45:05.7201651Z ##[group]Run malfet/checkout@silent-checkout 2024-08-06T20:45:05.7201974Z with: 2024-08-06T20:45:05.7202195Z ref: b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-06T20:45:05.7202478Z fetch-depth: 0 2024-08-06T20:45:05.7202697Z submodules: recursive 2024-08-06T20:45:05.7202929Z quiet-checkout: true 2024-08-06T20:45:05.7203164Z repository: pytorch/pytorch 2024-08-06T20:45:05.7203508Z token: *** 2024-08-06T20:45:05.7203721Z ssh-strict: true 2024-08-06T20:45:05.7203963Z persist-credentials: true 2024-08-06T20:45:05.7204228Z clean: true 2024-08-06T20:45:05.7204481Z sparse-checkout-cone-mode: true 2024-08-06T20:45:05.7204765Z lfs: false 2024-08-06T20:45:05.7204986Z set-safe-directory: true 2024-08-06T20:45:05.7205234Z env: 2024-08-06T20:45:05.7205435Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:45:05.7205667Z ##[endgroup] 2024-08-06T20:45:05.8182828Z Syncing repository: pytorch/pytorch 2024-08-06T20:45:05.8183960Z ##[group]Getting Git version info 2024-08-06T20:45:05.8184373Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2024-08-06T20:45:05.8185367Z [command]/usr/bin/git version 2024-08-06T20:45:05.8187332Z git version 2.40.1 2024-08-06T20:45:05.8208546Z ##[endgroup] 2024-08-06T20:45:05.8221656Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/928b79ee-4554-47a3-a61e-f1ddc27fa1cc' before making global git config changes 2024-08-06T20:45:05.8222520Z Adding repository directory to the temporary git global config as a safe directory 2024-08-06T20:45:05.8226355Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-06T20:45:05.8280950Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2024-08-06T20:45:05.8284564Z ##[group]Initializing the repository 2024-08-06T20:45:05.8287179Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-06T20:45:05.8333574Z hint: Using 'master' as the name for the initial branch. This default branch name 2024-08-06T20:45:05.8334127Z hint: is subject to change. To configure the initial branch name to use in all 2024-08-06T20:45:05.8334634Z hint: of your new repositories, which will suppress this warning, call: 2024-08-06T20:45:05.8335070Z hint: 2024-08-06T20:45:05.8335548Z hint: git config --global init.defaultBranch 2024-08-06T20:45:05.8335914Z hint: 2024-08-06T20:45:05.8336218Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2024-08-06T20:45:05.8336733Z hint: 'development'. The just-created branch can be renamed via this command: 2024-08-06T20:45:05.8337426Z hint: 2024-08-06T20:45:05.8339100Z hint: git branch -m 2024-08-06T20:45:05.8339560Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2024-08-06T20:45:05.8346569Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2024-08-06T20:45:05.8389955Z ##[endgroup] 2024-08-06T20:45:05.8390358Z ##[group]Disabling automatic garbage collection 2024-08-06T20:45:05.8393050Z [command]/usr/bin/git config --local gc.auto 0 2024-08-06T20:45:05.8435105Z ##[endgroup] 2024-08-06T20:45:05.8435608Z ##[group]Setting up auth 2024-08-06T20:45:05.8440988Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2024-08-06T20:45:05.8483683Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2024-08-06T20:45:05.8835291Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2024-08-06T20:45:05.8877753Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2024-08-06T20:45:05.9232457Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2024-08-06T20:45:05.9294482Z ##[endgroup] 2024-08-06T20:45:05.9295236Z ##[group]Fetching the repository 2024-08-06T20:45:05.9300641Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --progress --no-recurse-submodules --quiet origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2024-08-06T20:45:08.1340507Z remote: Enumerating objects: 1008813 2024-08-06T20:45:08.1341126Z remote: Enumerating objects: 1009160, done. 2024-08-06T20:45:08.1345384Z remote: Counting objects: 0% (1/347) 2024-08-06T20:45:08.1345915Z remote: Counting objects: 1% (4/347) 2024-08-06T20:45:08.1346428Z remote: Counting objects: 2% (7/347) 2024-08-06T20:45:08.1346873Z remote: Counting objects: 3% (11/347) 2024-08-06T20:45:08.1347349Z remote: Counting objects: 4% (14/347) 2024-08-06T20:45:08.1347808Z remote: Counting objects: 5% (18/347) 2024-08-06T20:45:08.1348260Z remote: Counting objects: 6% (21/347) 2024-08-06T20:45:08.1348763Z remote: Counting objects: 7% (25/347) 2024-08-06T20:45:08.1349524Z remote: Counting objects: 8% (28/347) 2024-08-06T20:45:08.1350018Z remote: Counting objects: 9% (32/347) 2024-08-06T20:45:08.1350379Z remote: Counting objects: 10% (35/347) 2024-08-06T20:45:08.1350708Z remote: Counting objects: 11% (39/347) 2024-08-06T20:45:08.1351034Z remote: Counting objects: 12% (42/347) 2024-08-06T20:45:08.1351356Z remote: Counting objects: 13% (46/347) 2024-08-06T20:45:08.1351682Z remote: Counting objects: 14% (49/347) 2024-08-06T20:45:08.1352018Z remote: Counting objects: 15% (53/347) 2024-08-06T20:45:08.1352337Z remote: Counting objects: 16% (56/347) 2024-08-06T20:45:08.1352665Z remote: Counting objects: 17% (59/347) 2024-08-06T20:45:08.1352994Z remote: Counting objects: 18% (63/347) 2024-08-06T20:45:08.1353318Z remote: Counting objects: 19% (66/347) 2024-08-06T20:45:08.1353649Z remote: Counting objects: 20% (70/347) 2024-08-06T20:45:08.1353982Z remote: Counting objects: 21% (73/347) 2024-08-06T20:45:08.1354307Z remote: Counting objects: 22% (77/347) 2024-08-06T20:45:08.1354638Z remote: Counting objects: 23% (80/347) 2024-08-06T20:45:08.1354963Z remote: Counting objects: 24% (84/347) 2024-08-06T20:45:08.1355285Z remote: Counting objects: 25% (87/347) 2024-08-06T20:45:08.1355613Z remote: Counting objects: 26% (91/347) 2024-08-06T20:45:08.1356044Z remote: Counting objects: 27% (94/347) 2024-08-06T20:45:08.1356562Z remote: Counting objects: 28% (98/347) 2024-08-06T20:45:08.1356898Z remote: Counting objects: 29% (101/347) 2024-08-06T20:45:08.1357231Z remote: Counting objects: 30% (105/347) 2024-08-06T20:45:08.1357560Z remote: Counting objects: 31% (108/347) 2024-08-06T20:45:08.1357887Z remote: Counting objects: 32% (112/347) 2024-08-06T20:45:08.1358213Z remote: Counting objects: 33% (115/347) 2024-08-06T20:45:08.1358533Z remote: Counting objects: 34% (118/347) 2024-08-06T20:45:08.1358867Z remote: Counting objects: 35% (122/347) 2024-08-06T20:45:08.1359193Z remote: Counting objects: 36% (125/347) 2024-08-06T20:45:08.1359516Z remote: Counting objects: 37% (129/347) 2024-08-06T20:45:08.1359844Z remote: Counting objects: 38% (132/347) 2024-08-06T20:45:08.1360169Z remote: Counting objects: 39% (136/347) 2024-08-06T20:45:08.1360491Z remote: Counting objects: 40% (139/347) 2024-08-06T20:45:08.1360827Z remote: Counting objects: 41% (143/347) 2024-08-06T20:45:08.1361155Z remote: Counting objects: 42% (146/347) 2024-08-06T20:45:08.1361477Z remote: Counting objects: 43% (150/347) 2024-08-06T20:45:08.1361806Z remote: Counting objects: 44% (153/347) 2024-08-06T20:45:08.1362135Z remote: Counting objects: 45% (157/347) 2024-08-06T20:45:08.1362457Z remote: Counting objects: 46% (160/347) 2024-08-06T20:45:08.1362785Z remote: Counting objects: 47% (164/347) 2024-08-06T20:45:08.1363119Z remote: Counting objects: 48% (167/347) 2024-08-06T20:45:08.1363442Z remote: Counting objects: 49% (171/347) 2024-08-06T20:45:08.1363768Z remote: Counting objects: 50% (174/347) 2024-08-06T20:45:08.1364096Z remote: Counting objects: 51% (177/347) 2024-08-06T20:45:08.1364418Z remote: Counting objects: 52% (181/347) 2024-08-06T20:45:08.1364748Z remote: Counting objects: 53% (184/347) 2024-08-06T20:45:08.1365075Z remote: Counting objects: 54% (188/347) 2024-08-06T20:45:08.1365400Z remote: Counting objects: 55% (191/347) 2024-08-06T20:45:08.1365730Z remote: Counting objects: 56% (195/347) 2024-08-06T20:45:08.1366063Z remote: Counting objects: 57% (198/347) 2024-08-06T20:45:08.1366387Z remote: Counting objects: 58% (202/347) 2024-08-06T20:45:08.1366721Z remote: Counting objects: 59% (205/347) 2024-08-06T20:45:08.1367053Z remote: Counting objects: 60% (209/347) 2024-08-06T20:45:08.1367479Z remote: Counting objects: 61% (212/347) 2024-08-06T20:45:08.1367808Z remote: Counting objects: 62% (216/347) 2024-08-06T20:45:08.1368157Z remote: Counting objects: 63% (219/347) 2024-08-06T20:45:08.1368486Z remote: Counting objects: 64% (223/347) 2024-08-06T20:45:08.1368808Z remote: Counting objects: 65% (226/347) 2024-08-06T20:45:08.1369138Z remote: Counting objects: 66% (230/347) 2024-08-06T20:45:08.1369470Z remote: Counting objects: 67% (233/347) 2024-08-06T20:45:08.1369797Z remote: Counting objects: 68% (236/347) 2024-08-06T20:45:08.1370194Z remote: Counting objects: 69% (240/347) 2024-08-06T20:45:08.1370520Z remote: Counting objects: 70% (243/347) 2024-08-06T20:45:08.1370843Z remote: Counting objects: 71% (247/347) 2024-08-06T20:45:08.1371171Z remote: Counting objects: 72% (250/347) 2024-08-06T20:45:08.1371505Z remote: Counting objects: 73% (254/347) 2024-08-06T20:45:08.1371833Z remote: Counting objects: 74% (257/347) 2024-08-06T20:45:08.1372159Z remote: Counting objects: 75% (261/347) 2024-08-06T20:45:08.1372486Z remote: Counting objects: 76% (264/347) 2024-08-06T20:45:08.1372804Z remote: Counting objects: 77% (268/347) 2024-08-06T20:45:08.1373128Z remote: Counting objects: 78% (271/347) 2024-08-06T20:45:08.1373455Z remote: Counting objects: 79% (275/347) 2024-08-06T20:45:08.1373775Z remote: Counting objects: 80% (278/347) 2024-08-06T20:45:08.1374206Z remote: Counting objects: 81% (282/347) 2024-08-06T20:45:08.1374540Z remote: Counting objects: 82% (285/347) 2024-08-06T20:45:08.1375001Z remote: Counting objects: 83% (289/347) 2024-08-06T20:45:08.1375328Z remote: Counting objects: 84% (292/347) 2024-08-06T20:45:08.1375658Z remote: Counting objects: 85% (295/347) 2024-08-06T20:45:08.1375980Z remote: Counting objects: 86% (299/347) 2024-08-06T20:45:08.1376316Z remote: Counting objects: 87% (302/347) 2024-08-06T20:45:08.1376648Z remote: Counting objects: 88% (306/347) 2024-08-06T20:45:08.1376972Z remote: Counting objects: 89% (309/347) 2024-08-06T20:45:08.1377304Z remote: Counting objects: 90% (313/347) 2024-08-06T20:45:08.1377631Z remote: Counting objects: 91% (316/347) 2024-08-06T20:45:08.1377956Z remote: Counting objects: 92% (320/347) 2024-08-06T20:45:08.1378283Z remote: Counting objects: 93% (323/347) 2024-08-06T20:45:08.1378641Z remote: Counting objects: 94% (327/347) 2024-08-06T20:45:08.1378966Z remote: Counting objects: 95% (330/347) 2024-08-06T20:45:08.1379315Z remote: Counting objects: 96% (334/347) 2024-08-06T20:45:08.1379646Z remote: Counting objects: 97% (337/347) 2024-08-06T20:45:08.1379970Z remote: Counting objects: 98% (341/347) 2024-08-06T20:45:08.1380303Z remote: Counting objects: 99% (344/347) 2024-08-06T20:45:08.1380637Z remote: Counting objects: 100% (347/347) 2024-08-06T20:45:08.1380987Z remote: Counting objects: 100% (347/347), done. 2024-08-06T20:45:08.1408195Z remote: Compressing objects: 0% (1/202) 2024-08-06T20:45:08.1449566Z remote: Compressing objects: 1% (3/202) 2024-08-06T20:45:08.1537740Z remote: Compressing objects: 2% (5/202) 2024-08-06T20:45:08.1579427Z remote: Compressing objects: 3% (7/202) 2024-08-06T20:45:08.1617408Z remote: Compressing objects: 4% (9/202) 2024-08-06T20:45:08.1819245Z remote: Compressing objects: 5% (11/202) 2024-08-06T20:45:08.1870910Z remote: Compressing objects: 6% (13/202) 2024-08-06T20:45:08.1922664Z remote: Compressing objects: 7% (15/202) 2024-08-06T20:45:08.1946247Z remote: Compressing objects: 8% (17/202) 2024-08-06T20:45:08.1980160Z remote: Compressing objects: 9% (19/202) 2024-08-06T20:45:08.2090338Z remote: Compressing objects: 10% (21/202) 2024-08-06T20:45:08.2091582Z remote: Compressing objects: 11% (23/202) 2024-08-06T20:45:08.2091924Z remote: Compressing objects: 12% (25/202) 2024-08-06T20:45:08.2092460Z remote: Compressing objects: 13% (27/202) 2024-08-06T20:45:08.2092798Z remote: Compressing objects: 14% (29/202) 2024-08-06T20:45:08.2093134Z remote: Compressing objects: 15% (31/202) 2024-08-06T20:45:08.2093464Z remote: Compressing objects: 16% (33/202) 2024-08-06T20:45:08.2093800Z remote: Compressing objects: 17% (35/202) 2024-08-06T20:45:08.2094140Z remote: Compressing objects: 18% (37/202) 2024-08-06T20:45:08.2094470Z remote: Compressing objects: 19% (39/202) 2024-08-06T20:45:08.2094868Z remote: Compressing objects: 20% (41/202) 2024-08-06T20:45:08.2095206Z remote: Compressing objects: 21% (43/202) 2024-08-06T20:45:08.2095539Z remote: Compressing objects: 22% (45/202) 2024-08-06T20:45:08.2095874Z remote: Compressing objects: 23% (47/202) 2024-08-06T20:45:08.2096224Z remote: Compressing objects: 24% (49/202) 2024-08-06T20:45:08.2096554Z remote: Compressing objects: 25% (51/202) 2024-08-06T20:45:08.2099016Z remote: Compressing objects: 26% (53/202) 2024-08-06T20:45:08.2099693Z remote: Compressing objects: 27% (55/202) 2024-08-06T20:45:08.2101004Z remote: Compressing objects: 28% (57/202) 2024-08-06T20:45:08.2101340Z remote: Compressing objects: 29% (59/202) 2024-08-06T20:45:08.2104140Z remote: Compressing objects: 30% (61/202) 2024-08-06T20:45:08.2105674Z remote: Compressing objects: 31% (63/202) 2024-08-06T20:45:08.2107219Z remote: Compressing objects: 32% (65/202) 2024-08-06T20:45:08.2111573Z remote: Compressing objects: 33% (67/202) 2024-08-06T20:45:08.2112183Z remote: Compressing objects: 34% (69/202) 2024-08-06T20:45:08.2112737Z remote: Compressing objects: 35% (71/202) 2024-08-06T20:45:08.2115051Z remote: Compressing objects: 36% (73/202) 2024-08-06T20:45:08.2116946Z remote: Compressing objects: 37% (75/202) 2024-08-06T20:45:08.2117433Z remote: Compressing objects: 38% (77/202) 2024-08-06T20:45:08.2120082Z remote: Compressing objects: 39% (79/202) 2024-08-06T20:45:08.2122772Z remote: Compressing objects: 40% (81/202) 2024-08-06T20:45:08.2126020Z remote: Compressing objects: 41% (83/202) 2024-08-06T20:45:08.2127070Z remote: Compressing objects: 42% (85/202) 2024-08-06T20:45:08.2129315Z remote: Compressing objects: 43% (87/202) 2024-08-06T20:45:08.2129904Z remote: Compressing objects: 44% (89/202) 2024-08-06T20:45:08.2131946Z remote: Compressing objects: 45% (91/202) 2024-08-06T20:45:08.2134188Z remote: Compressing objects: 46% (93/202) 2024-08-06T20:45:08.2136145Z remote: Compressing objects: 47% (95/202) 2024-08-06T20:45:08.2136694Z remote: Compressing objects: 48% (97/202) 2024-08-06T20:45:08.2138632Z remote: Compressing objects: 49% (99/202) 2024-08-06T20:45:08.2140798Z remote: Compressing objects: 50% (101/202) 2024-08-06T20:45:08.2141377Z remote: Compressing objects: 51% (104/202) 2024-08-06T20:45:08.2141916Z remote: Compressing objects: 52% (106/202) 2024-08-06T20:45:08.2142449Z remote: Compressing objects: 53% (108/202) 2024-08-06T20:45:08.2142894Z remote: Compressing objects: 54% (110/202) 2024-08-06T20:45:08.2143239Z remote: Compressing objects: 55% (112/202) 2024-08-06T20:45:08.2143587Z remote: Compressing objects: 56% (114/202) 2024-08-06T20:45:08.2144057Z remote: Compressing objects: 57% (116/202) 2024-08-06T20:45:08.2144567Z remote: Compressing objects: 58% (118/202) 2024-08-06T20:45:08.2144908Z remote: Compressing objects: 59% (120/202) 2024-08-06T20:45:08.2145334Z remote: Compressing objects: 60% (122/202) 2024-08-06T20:45:08.2145890Z remote: Compressing objects: 61% (124/202) 2024-08-06T20:45:08.2146571Z remote: Compressing objects: 62% (126/202) 2024-08-06T20:45:08.2146972Z remote: Compressing objects: 63% (128/202) 2024-08-06T20:45:08.2147547Z remote: Compressing objects: 64% (130/202) 2024-08-06T20:45:08.2147901Z remote: Compressing objects: 65% (132/202) 2024-08-06T20:45:08.2148234Z remote: Compressing objects: 66% (134/202) 2024-08-06T20:45:08.2148638Z remote: Compressing objects: 67% (136/202) 2024-08-06T20:45:08.2149227Z remote: Compressing objects: 68% (138/202) 2024-08-06T20:45:08.2149765Z remote: Compressing objects: 69% (140/202) 2024-08-06T20:45:08.2150216Z remote: Compressing objects: 70% (142/202) 2024-08-06T20:45:08.2150554Z remote: Compressing objects: 71% (144/202) 2024-08-06T20:45:08.2150889Z remote: Compressing objects: 72% (146/202) 2024-08-06T20:45:08.2160459Z remote: Compressing objects: 73% (148/202) 2024-08-06T20:45:08.2161025Z remote: Compressing objects: 74% (150/202) 2024-08-06T20:45:08.2161590Z remote: Compressing objects: 75% (152/202) 2024-08-06T20:45:08.2161971Z remote: Compressing objects: 76% (154/202) 2024-08-06T20:45:08.2162307Z remote: Compressing objects: 77% (156/202) 2024-08-06T20:45:08.2162648Z remote: Compressing objects: 78% (158/202) 2024-08-06T20:45:08.2162979Z remote: Compressing objects: 79% (160/202) 2024-08-06T20:45:08.2163315Z remote: Compressing objects: 80% (162/202) 2024-08-06T20:45:08.2163817Z remote: Compressing objects: 81% (164/202) 2024-08-06T20:45:08.2164161Z remote: Compressing objects: 82% (166/202) 2024-08-06T20:45:08.2164490Z remote: Compressing objects: 83% (168/202) 2024-08-06T20:45:08.2164838Z remote: Compressing objects: 84% (170/202) 2024-08-06T20:45:08.2165177Z remote: Compressing objects: 85% (172/202) 2024-08-06T20:45:08.2165507Z remote: Compressing objects: 86% (174/202) 2024-08-06T20:45:08.2165853Z remote: Compressing objects: 87% (176/202) 2024-08-06T20:45:08.2166193Z remote: Compressing objects: 88% (178/202) 2024-08-06T20:45:08.2166569Z remote: Compressing objects: 89% (180/202) 2024-08-06T20:45:08.2166907Z remote: Compressing objects: 90% (182/202) 2024-08-06T20:45:08.2167256Z remote: Compressing objects: 91% (184/202) 2024-08-06T20:45:08.2167598Z remote: Compressing objects: 92% (186/202) 2024-08-06T20:45:08.2167933Z remote: Compressing objects: 93% (188/202) 2024-08-06T20:45:08.2168285Z remote: Compressing objects: 94% (190/202) 2024-08-06T20:45:08.2168631Z remote: Compressing objects: 95% (192/202) 2024-08-06T20:45:08.2168979Z remote: Compressing objects: 96% (194/202) 2024-08-06T20:45:08.2169319Z remote: Compressing objects: 97% (196/202) 2024-08-06T20:45:08.2169668Z remote: Compressing objects: 98% (198/202) 2024-08-06T20:45:08.2170011Z remote: Compressing objects: 99% (200/202) 2024-08-06T20:45:08.2170357Z remote: Compressing objects: 100% (202/202) 2024-08-06T20:45:08.2170730Z remote: Compressing objects: 100% (202/202), done. 2024-08-06T20:45:29.7451406Z remote: Total 1009160 (delta 172), reused 275 (delta 143), pack-reused 1008813 2024-08-06T20:45:52.2018472Z [command]/usr/bin/git rev-parse --verify --quiet b9d86fa89636e301796d4201f36d86c73f6e49bc^{object} 2024-08-06T20:45:52.2054982Z b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-06T20:45:52.2060573Z ##[endgroup] 2024-08-06T20:45:52.2061122Z ##[group]Determining the checkout info 2024-08-06T20:45:52.2061702Z ##[endgroup] 2024-08-06T20:45:52.2062172Z ##[group]Checking out the ref 2024-08-06T20:45:52.2065003Z [command]/usr/bin/git checkout --quiet --force b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-06T20:45:53.9309100Z ##[endgroup] 2024-08-06T20:45:53.9309499Z ##[group]Setting up auth for fetching submodules 2024-08-06T20:45:53.9310205Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2024-08-06T20:45:53.9371975Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2024-08-06T20:45:53.9412839Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2024-08-06T20:45:53.9454056Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2024-08-06T20:45:53.9493233Z ##[endgroup] 2024-08-06T20:45:53.9493729Z ##[group]Fetching submodules 2024-08-06T20:45:53.9497333Z [command]/usr/bin/git submodule sync --recursive 2024-08-06T20:45:53.9877939Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2024-08-06T20:45:54.0246881Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2024-08-06T20:45:54.0248216Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2024-08-06T20:45:54.0252031Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2024-08-06T20:45:54.0255819Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2024-08-06T20:45:54.0259836Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2024-08-06T20:45:54.0263619Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2024-08-06T20:45:54.0267226Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2024-08-06T20:45:54.0271099Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2024-08-06T20:45:54.0275015Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2024-08-06T20:45:54.0279300Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2024-08-06T20:45:54.0283205Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2024-08-06T20:45:54.0287392Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen' 2024-08-06T20:45:54.0291821Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2024-08-06T20:45:54.0296732Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2024-08-06T20:45:54.0300922Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2024-08-06T20:45:54.0306631Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi' 2024-08-06T20:45:54.0314824Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2024-08-06T20:45:54.0319391Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo' 2024-08-06T20:45:54.0324228Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2024-08-06T20:45:54.0329043Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2024-08-06T20:45:54.0333853Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2024-08-06T20:45:54.0339096Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2024-08-06T20:45:54.0344135Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2024-08-06T20:45:54.0349088Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl' 2024-08-06T20:45:54.0354371Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2024-08-06T20:45:54.0359319Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2024-08-06T20:45:54.0364823Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2024-08-06T20:45:54.0369979Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2024-08-06T20:45:54.0375563Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2024-08-06T20:45:54.0381111Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2024-08-06T20:45:54.0386673Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2024-08-06T20:45:54.0392488Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2024-08-06T20:45:54.0398306Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2024-08-06T20:45:54.0407598Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2024-08-06T20:45:54.0413351Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2024-08-06T20:45:54.0448676Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2024-08-06T20:45:54.3447186Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2024-08-06T20:45:54.5116146Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2024-08-06T20:45:54.8096831Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2024-08-06T20:45:55.0726291Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2024-08-06T20:45:56.9368358Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2024-08-06T20:46:07.8571265Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2024-08-06T20:46:08.2758555Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpp-httplib'... 2024-08-06T20:46:08.8811584Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2024-08-06T20:46:09.4745934Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2024-08-06T20:46:10.6789815Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2024-08-06T20:46:12.5877423Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'... 2024-08-06T20:46:18.5060915Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2024-08-06T20:46:19.7590272Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2024-08-06T20:46:21.4330214Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2024-08-06T20:46:22.7238539Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'... 2024-08-06T20:46:22.9089442Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2024-08-06T20:46:23.3229446Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2024-08-06T20:46:23.6428944Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2024-08-06T20:46:24.8048076Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2024-08-06T20:46:25.1605960Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2024-08-06T20:46:25.4281298Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2024-08-06T20:46:26.8979860Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/mimalloc'... 2024-08-06T20:46:27.6254692Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'... 2024-08-06T20:46:28.3337126Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2024-08-06T20:46:34.3560044Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2024-08-06T20:46:36.2788819Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2024-08-06T20:46:40.1694726Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2024-08-06T20:46:40.4332168Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2024-08-06T20:46:48.8298791Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2024-08-06T20:46:49.0056662Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2024-08-06T20:46:49.2825093Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2024-08-06T20:46:50.1944208Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2024-08-06T20:46:50.4904794Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2024-08-06T20:46:51.0903172Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2024-08-06T20:46:51.5245900Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2024-08-06T20:46:51.5394684Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2024-08-06T20:46:51.5517619Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2024-08-06T20:46:51.5836835Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2024-08-06T20:46:51.6299266Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191' 2024-08-06T20:46:52.7928273Z Submodule path 'third_party/XNNPACK': checked out 'fcbf55af6cf28a4627bcd1f703ab7ad843f0f3a2' 2024-08-06T20:46:52.8223793Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2024-08-06T20:46:52.8750279Z Submodule path 'third_party/cpp-httplib': checked out '3b6597bba913d51161383657829b7e644e59c006' 2024-08-06T20:46:52.9882101Z Submodule path 'third_party/cpuinfo': checked out '3c8b1533ac03dd6531ab6e7b9245d488f13a82a5' 2024-08-06T20:46:53.0286193Z Submodule path 'third_party/cudnn_frontend': checked out '98ca4e1941fe3263f128f74f10063a3ea35c7019' 2024-08-06T20:46:53.6540666Z Submodule path 'third_party/cutlass': checked out 'bbe579a9e3beb6ea6626d9227ec32d0dae119a49' 2024-08-06T20:46:53.9424610Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c' 2024-08-06T20:46:54.0354029Z Submodule path 'third_party/fbgemm': checked out 'dbc3157bf256f1339b3fa1fef2be89ac4078be0e' 2024-08-06T20:46:54.0373083Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit' 2024-08-06T20:46:54.0377302Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T20:46:54.0380633Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/fbgemm/third_party/cutlass' 2024-08-06T20:46:54.0384216Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest' 2024-08-06T20:46:54.0387789Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T20:46:54.0420703Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'... 2024-08-06T20:46:55.2388452Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'... 2024-08-06T20:46:55.8470510Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cutlass'... 2024-08-06T20:46:57.8407185Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'... 2024-08-06T20:46:58.9164893Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'... 2024-08-06T20:46:59.2548000Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81' 2024-08-06T20:46:59.3656198Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3' 2024-08-06T20:46:59.8766020Z Submodule path 'third_party/fbgemm/third_party/cutlass': checked out 'fc9ebc645b63f3a6bc80aaefde5c063fb72110d6' 2024-08-06T20:46:59.9474356Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796' 2024-08-06T20:46:59.9627174Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '23f53b025b466d8ec3c45d52290d3442f7fbe6b1' 2024-08-06T20:47:00.1114410Z Submodule path 'third_party/flatbuffers': checked out '01834de25e4bf3975a9a00e816292b1ad0fe184b' 2024-08-06T20:47:00.1580546Z Submodule path 'third_party/fmt': checked out '0c9fce2ffefecfdce794e1859584e25877b7b592' 2024-08-06T20:47:00.1706265Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad' 2024-08-06T20:47:00.2167275Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2024-08-06T20:47:00.2492345Z Submodule path 'third_party/gloo': checked out '5354032ea08eadd7fc4456477f7f7c6308818509' 2024-08-06T20:47:00.3029976Z Submodule path 'third_party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2024-08-06T20:47:00.3188817Z Submodule path 'third_party/ideep': checked out '55ca0191687aaf19aca5cdb7881c791e3bea442b' 2024-08-06T20:47:00.3208447Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2024-08-06T20:47:00.3238315Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 2024-08-06T20:47:13.0090058Z Submodule path 'third_party/ideep/mkl-dnn': checked out '1137e04ec0b5251ca2b4400a4fd3c667ce843d67' 2024-08-06T20:47:13.0312242Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' 2024-08-06T20:47:13.1305646Z Submodule path 'third_party/kineto': checked out 'da2f2682cabaf95d601fa2a9b7e0979f84fe7667' 2024-08-06T20:47:13.1325063Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T20:47:13.1328431Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T20:47:13.1331975Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T20:47:13.1364150Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2024-08-06T20:47:14.9633250Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2024-08-06T20:47:16.1035193Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 2024-08-06T20:47:17.2766240Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2024-08-06T20:47:17.2787363Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T20:47:17.2790838Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T20:47:17.2796424Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T20:47:17.2799985Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T20:47:17.2803786Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T20:47:17.2807604Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T20:47:17.2811447Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T20:47:17.2815745Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T20:47:17.2847382Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2024-08-06T20:47:18.1944014Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 2024-08-06T20:47:18.5654246Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2024-08-06T20:47:19.7129928Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2024-08-06T20:47:19.9913337Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2024-08-06T20:47:20.5243949Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2024-08-06T20:47:21.5899057Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 2024-08-06T20:47:29.0267946Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2024-08-06T20:47:29.5189646Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2024-08-06T20:47:29.5433349Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2024-08-06T20:47:29.5876271Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2024-08-06T20:47:29.6045784Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2024-08-06T20:47:29.6066305Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T20:47:29.6099398Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 2024-08-06T20:47:30.0371841Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2024-08-06T20:47:30.0599698Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2024-08-06T20:47:30.1085605Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2024-08-06T20:47:30.2364614Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2024-08-06T20:47:30.2573413Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2024-08-06T20:47:30.3067523Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2024-08-06T20:47:30.3771140Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2024-08-06T20:47:30.4227591Z Submodule path 'third_party/mimalloc': checked out 'b66e3214d8a104669c2ec05ae91ebc26a8f5ab78' 2024-08-06T20:47:30.4530002Z Submodule path 'third_party/nccl/nccl': checked out 'ab2b89c4c339bd7f816fbc114a4b05d386b66290' 2024-08-06T20:47:30.5790602Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806' 2024-08-06T20:47:31.0939716Z Submodule path 'third_party/onnx': checked out '3bf92c03a9f27eba3bda1e5b9e63ea20ec213557' 2024-08-06T20:47:31.0978577Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark' 2024-08-06T20:47:31.0981832Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2024-08-06T20:47:31.1015803Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'... 2024-08-06T20:47:31.5705751Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 2024-08-06T20:47:32.4674811Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '2dd015dfef425c866d9a43f2c67d8b52d709acb6' 2024-08-06T20:47:32.5082850Z Submodule path 'third_party/onnx/third_party/pybind11': checked out '5b0a6fc2017fcc176545afe3e09c9f9885283242' 2024-08-06T20:47:32.5993199Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2024-08-06T20:47:32.6017910Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T20:47:32.6021167Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T20:47:32.6024584Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T20:47:32.6028127Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T20:47:32.6031979Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T20:47:32.6035487Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T20:47:32.6039129Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T20:47:32.6042948Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T20:47:32.6075125Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 2024-08-06T20:47:33.0115997Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 2024-08-06T20:47:34.0714739Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2024-08-06T20:47:34.4827732Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2024-08-06T20:47:40.4852227Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2024-08-06T20:47:40.7425754Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2024-08-06T20:47:40.9530650Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2024-08-06T20:47:41.2479235Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 2024-08-06T20:47:48.7499661Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2024-08-06T20:47:48.7978289Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2024-08-06T20:47:48.8169155Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2024-08-06T20:47:48.9407240Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2024-08-06T20:47:48.9568884Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2024-08-06T20:47:48.9760342Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2024-08-06T20:47:48.9968723Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2024-08-06T20:47:48.9988162Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T20:47:48.9991739Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T20:47:49.0023586Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2024-08-06T20:47:50.8987674Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 2024-08-06T20:47:52.2187658Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2024-08-06T20:47:52.2732510Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2024-08-06T20:47:52.9494557Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2024-08-06T20:47:52.9642824Z Submodule path 'third_party/pocketfft': checked out '9d3ab05a7fffbc71a492bc6a17be034e83e8f0fe' 2024-08-06T20:47:53.2841612Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2024-08-06T20:47:53.2864762Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2024-08-06T20:47:53.2868156Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2024-08-06T20:47:53.2901462Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2024-08-06T20:47:53.7860986Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 2024-08-06T20:47:54.8669010Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2024-08-06T20:47:54.9482920Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2024-08-06T20:47:54.9602538Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2024-08-06T20:47:54.9756173Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2024-08-06T20:47:55.0213449Z Submodule path 'third_party/pybind11': checked out '941f45bcb51457884fa1afd6e24a67377d70f75c' 2024-08-06T20:47:55.0558495Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2024-08-06T20:47:55.1050469Z Submodule path 'third_party/sleef': checked out '60e76d2bce17d278b439d9da17177c8f957a9e9b' 2024-08-06T20:47:55.1401749Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2024-08-06T20:47:55.1421566Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2024-08-06T20:47:55.1425075Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2024-08-06T20:47:55.1428365Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2024-08-06T20:47:55.1431853Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T20:47:55.1464125Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2024-08-06T20:47:56.2084742Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2024-08-06T20:47:56.4414546Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2024-08-06T20:47:57.6450991Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2024-08-06T20:47:58.5798556Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2024-08-06T20:47:58.5992741Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2024-08-06T20:47:58.6794165Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2024-08-06T20:47:58.7147060Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2024-08-06T20:47:58.7167276Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T20:47:58.7198546Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 2024-08-06T20:47:58.9464308Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2024-08-06T20:47:58.9509908Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2024-08-06T20:47:58.9882706Z Entering 'android/libs/fbjni' 2024-08-06T20:47:58.9934343Z Entering 'third_party/FP16' 2024-08-06T20:47:58.9985881Z Entering 'third_party/FXdiv' 2024-08-06T20:47:59.0038003Z Entering 'third_party/NNPACK' 2024-08-06T20:47:59.0089849Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T20:47:59.0141171Z Entering 'third_party/XNNPACK' 2024-08-06T20:47:59.0209612Z Entering 'third_party/benchmark' 2024-08-06T20:47:59.0260901Z Entering 'third_party/cpp-httplib' 2024-08-06T20:47:59.0312971Z Entering 'third_party/cpuinfo' 2024-08-06T20:47:59.0364882Z Entering 'third_party/cudnn_frontend' 2024-08-06T20:47:59.0424847Z Entering 'third_party/cutlass' 2024-08-06T20:47:59.0484499Z Entering 'third_party/eigen' 2024-08-06T20:47:59.0541062Z Entering 'third_party/fbgemm' 2024-08-06T20:47:59.0592428Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T20:47:59.0642978Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T20:47:59.0693449Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T20:47:59.0750446Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T20:47:59.0801429Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T20:47:59.0854838Z Entering 'third_party/flatbuffers' 2024-08-06T20:47:59.0910114Z Entering 'third_party/fmt' 2024-08-06T20:47:59.0961739Z Entering 'third_party/foxi' 2024-08-06T20:47:59.1013852Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T20:47:59.1066045Z Entering 'third_party/gloo' 2024-08-06T20:47:59.1118746Z Entering 'third_party/googletest' 2024-08-06T20:47:59.1170975Z Entering 'third_party/ideep' 2024-08-06T20:47:59.1223932Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T20:47:59.1282751Z Entering 'third_party/ittapi' 2024-08-06T20:47:59.1335249Z Entering 'third_party/kineto' 2024-08-06T20:47:59.1385827Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T20:47:59.1441701Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T20:47:59.1496735Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T20:47:59.1547647Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T20:47:59.1599126Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T20:47:59.1648097Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T20:47:59.1704288Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T20:47:59.1755088Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T20:47:59.1806980Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T20:47:59.1858319Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T20:47:59.1912298Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T20:47:59.1962631Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T20:47:59.2019739Z Entering 'third_party/mimalloc' 2024-08-06T20:47:59.2071648Z Entering 'third_party/nccl/nccl' 2024-08-06T20:47:59.2124001Z Entering 'third_party/nlohmann' 2024-08-06T20:47:59.2176437Z Entering 'third_party/onnx' 2024-08-06T20:47:59.2242353Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T20:47:59.2292976Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T20:47:59.2350139Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T20:47:59.2402943Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T20:47:59.2453243Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T20:47:59.2503676Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T20:47:59.2553052Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T20:47:59.2604771Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T20:47:59.2654396Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T20:47:59.2705132Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T20:47:59.2753992Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T20:47:59.2808439Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T20:47:59.2861946Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T20:47:59.2935711Z Entering 'third_party/pocketfft' 2024-08-06T20:47:59.2987186Z Entering 'third_party/protobuf' 2024-08-06T20:47:59.3043317Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T20:47:59.3097931Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T20:47:59.3151404Z Entering 'third_party/psimd' 2024-08-06T20:47:59.3204516Z Entering 'third_party/pthreadpool' 2024-08-06T20:47:59.3256356Z Entering 'third_party/pybind11' 2024-08-06T20:47:59.3308303Z Entering 'third_party/python-peachpy' 2024-08-06T20:47:59.3360572Z Entering 'third_party/sleef' 2024-08-06T20:47:59.3410906Z Entering 'third_party/tensorpipe' 2024-08-06T20:47:59.3463502Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T20:47:59.3516163Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T20:47:59.3566769Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T20:47:59.3620143Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T20:47:59.3668974Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T20:47:59.3744719Z ##[endgroup] 2024-08-06T20:47:59.3747121Z ##[group]Persisting credentials for submodules 2024-08-06T20:47:59.3750521Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2024-08-06T20:47:59.4118823Z Entering 'android/libs/fbjni' 2024-08-06T20:47:59.4185297Z Entering 'third_party/FP16' 2024-08-06T20:47:59.4253642Z Entering 'third_party/FXdiv' 2024-08-06T20:47:59.4323372Z Entering 'third_party/NNPACK' 2024-08-06T20:47:59.4390887Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T20:47:59.4459426Z Entering 'third_party/XNNPACK' 2024-08-06T20:47:59.4542612Z Entering 'third_party/benchmark' 2024-08-06T20:47:59.4610502Z Entering 'third_party/cpp-httplib' 2024-08-06T20:47:59.4677965Z Entering 'third_party/cpuinfo' 2024-08-06T20:47:59.4751968Z Entering 'third_party/cudnn_frontend' 2024-08-06T20:47:59.4820301Z Entering 'third_party/cutlass' 2024-08-06T20:47:59.4894857Z Entering 'third_party/eigen' 2024-08-06T20:47:59.4963643Z Entering 'third_party/fbgemm' 2024-08-06T20:47:59.5029364Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T20:47:59.5097246Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T20:47:59.5163840Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T20:47:59.5236781Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T20:47:59.5304593Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T20:47:59.5375112Z Entering 'third_party/flatbuffers' 2024-08-06T20:47:59.5446212Z Entering 'third_party/fmt' 2024-08-06T20:47:59.5516516Z Entering 'third_party/foxi' 2024-08-06T20:47:59.5583388Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T20:47:59.5652219Z Entering 'third_party/gloo' 2024-08-06T20:47:59.5723060Z Entering 'third_party/googletest' 2024-08-06T20:47:59.5791966Z Entering 'third_party/ideep' 2024-08-06T20:47:59.5859244Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T20:47:59.5935795Z Entering 'third_party/ittapi' 2024-08-06T20:47:59.6005086Z Entering 'third_party/kineto' 2024-08-06T20:47:59.6071599Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T20:47:59.6137711Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T20:47:59.6205136Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T20:47:59.6271941Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T20:47:59.6339564Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T20:47:59.6406483Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T20:47:59.6477785Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T20:47:59.6546150Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T20:47:59.6615068Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T20:47:59.6684487Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T20:47:59.6757634Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T20:47:59.6825460Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T20:47:59.6895748Z Entering 'third_party/mimalloc' 2024-08-06T20:47:59.6964465Z Entering 'third_party/nccl/nccl' 2024-08-06T20:47:59.7033564Z Entering 'third_party/nlohmann' 2024-08-06T20:47:59.7104695Z Entering 'third_party/onnx' 2024-08-06T20:47:59.7184103Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T20:47:59.7252681Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T20:47:59.7326882Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T20:47:59.7400255Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T20:47:59.7466410Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T20:47:59.7533640Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T20:47:59.7600483Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T20:47:59.7667011Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T20:47:59.7732866Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T20:47:59.7800346Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T20:47:59.7864441Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T20:47:59.7934271Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T20:47:59.8002874Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T20:47:59.8090686Z Entering 'third_party/pocketfft' 2024-08-06T20:47:59.8157752Z Entering 'third_party/protobuf' 2024-08-06T20:47:59.8226806Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T20:47:59.8294451Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T20:47:59.8363349Z Entering 'third_party/psimd' 2024-08-06T20:47:59.8434038Z Entering 'third_party/pthreadpool' 2024-08-06T20:47:59.8502074Z Entering 'third_party/pybind11' 2024-08-06T20:47:59.8574008Z Entering 'third_party/python-peachpy' 2024-08-06T20:47:59.8642059Z Entering 'third_party/sleef' 2024-08-06T20:47:59.8709602Z Entering 'third_party/tensorpipe' 2024-08-06T20:47:59.8775259Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T20:47:59.8844453Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T20:47:59.8910567Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T20:47:59.8975498Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T20:47:59.9038708Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T20:47:59.9132706Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2024-08-06T20:47:59.9500823Z Entering 'android/libs/fbjni' 2024-08-06T20:47:59.9568119Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2024-08-06T20:47:59.9590489Z Entering 'third_party/FP16' 2024-08-06T20:47:59.9651630Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2024-08-06T20:47:59.9673543Z Entering 'third_party/FXdiv' 2024-08-06T20:47:59.9736193Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2024-08-06T20:47:59.9757948Z Entering 'third_party/NNPACK' 2024-08-06T20:47:59.9819474Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2024-08-06T20:47:59.9841620Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T20:47:59.9904163Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2024-08-06T20:47:59.9925849Z Entering 'third_party/XNNPACK' 2024-08-06T20:47:59.9987196Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2024-08-06T20:48:00.0025542Z Entering 'third_party/benchmark' 2024-08-06T20:48:00.0086690Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2024-08-06T20:48:00.0109013Z Entering 'third_party/cpp-httplib' 2024-08-06T20:48:00.0171485Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2024-08-06T20:48:00.0193318Z Entering 'third_party/cpuinfo' 2024-08-06T20:48:00.0254012Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2024-08-06T20:48:00.0276709Z Entering 'third_party/cudnn_frontend' 2024-08-06T20:48:00.0339644Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2024-08-06T20:48:00.0361570Z Entering 'third_party/cutlass' 2024-08-06T20:48:00.0423152Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2024-08-06T20:48:00.0452809Z Entering 'third_party/eigen' 2024-08-06T20:48:00.0515053Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2024-08-06T20:48:00.0538787Z Entering 'third_party/fbgemm' 2024-08-06T20:48:00.0605634Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2024-08-06T20:48:00.0626453Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T20:48:00.0687927Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2024-08-06T20:48:00.0709229Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T20:48:00.0771721Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2024-08-06T20:48:00.0792283Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T20:48:00.0855099Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cutlass/config remote.origin.url 2024-08-06T20:48:00.0881660Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T20:48:00.0944251Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.0965037Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T20:48:00.1027046Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2024-08-06T20:48:00.1050856Z Entering 'third_party/flatbuffers' 2024-08-06T20:48:00.1113321Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2024-08-06T20:48:00.1137854Z Entering 'third_party/fmt' 2024-08-06T20:48:00.1204236Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2024-08-06T20:48:00.1226267Z Entering 'third_party/foxi' 2024-08-06T20:48:00.1286773Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2024-08-06T20:48:00.1308833Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T20:48:00.1371764Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2024-08-06T20:48:00.1394413Z Entering 'third_party/gloo' 2024-08-06T20:48:00.1456534Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2024-08-06T20:48:00.1478963Z Entering 'third_party/googletest' 2024-08-06T20:48:00.1542225Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.1565592Z Entering 'third_party/ideep' 2024-08-06T20:48:00.1631731Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2024-08-06T20:48:00.1650907Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T20:48:00.1714448Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2024-08-06T20:48:00.1743544Z Entering 'third_party/ittapi' 2024-08-06T20:48:00.1805760Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2024-08-06T20:48:00.1827695Z Entering 'third_party/kineto' 2024-08-06T20:48:00.1889715Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2024-08-06T20:48:00.1910171Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T20:48:00.1973042Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2024-08-06T20:48:00.1991276Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T20:48:00.2053564Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2024-08-06T20:48:00.2077155Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T20:48:00.2139952Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2024-08-06T20:48:00.2161395Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T20:48:00.2228653Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2024-08-06T20:48:00.2250493Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T20:48:00.2313714Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2024-08-06T20:48:00.2333196Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T20:48:00.2395999Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2024-08-06T20:48:00.2419843Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T20:48:00.2483159Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2024-08-06T20:48:00.2505215Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T20:48:00.2567363Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.2589025Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T20:48:00.2652019Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2024-08-06T20:48:00.2675837Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T20:48:00.2738575Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2024-08-06T20:48:00.2765719Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T20:48:00.2826093Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2024-08-06T20:48:00.2846820Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T20:48:00.2908531Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.2932451Z Entering 'third_party/mimalloc' 2024-08-06T20:48:00.2996802Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2024-08-06T20:48:00.3018827Z Entering 'third_party/nccl/nccl' 2024-08-06T20:48:00.3084873Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2024-08-06T20:48:00.3107176Z Entering 'third_party/nlohmann' 2024-08-06T20:48:00.3168051Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2024-08-06T20:48:00.3191223Z Entering 'third_party/onnx' 2024-08-06T20:48:00.3254955Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2024-08-06T20:48:00.3290151Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T20:48:00.3353847Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2024-08-06T20:48:00.3375522Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T20:48:00.3439209Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2024-08-06T20:48:00.3465581Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T20:48:00.3527946Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2024-08-06T20:48:00.3550686Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T20:48:00.3612914Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2024-08-06T20:48:00.3633781Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T20:48:00.3696021Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.3717587Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T20:48:00.3780771Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2024-08-06T20:48:00.3803250Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T20:48:00.3865371Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2024-08-06T20:48:00.3889502Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T20:48:00.3956049Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2024-08-06T20:48:00.3977123Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T20:48:00.4038838Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2024-08-06T20:48:00.4059713Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T20:48:00.4126836Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2024-08-06T20:48:00.4145903Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T20:48:00.4207508Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2024-08-06T20:48:00.4230333Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T20:48:00.4292548Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2024-08-06T20:48:00.4317118Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T20:48:00.4378218Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2024-08-06T20:48:00.4422594Z Entering 'third_party/pocketfft' 2024-08-06T20:48:00.4483924Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2024-08-06T20:48:00.4507419Z Entering 'third_party/protobuf' 2024-08-06T20:48:00.4569105Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2024-08-06T20:48:00.4593320Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T20:48:00.4655500Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2024-08-06T20:48:00.4676670Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T20:48:00.4737813Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.4762663Z Entering 'third_party/psimd' 2024-08-06T20:48:00.4827261Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2024-08-06T20:48:00.4848830Z Entering 'third_party/pthreadpool' 2024-08-06T20:48:00.4913000Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2024-08-06T20:48:00.4934495Z Entering 'third_party/pybind11' 2024-08-06T20:48:00.4996899Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2024-08-06T20:48:00.5018973Z Entering 'third_party/python-peachpy' 2024-08-06T20:48:00.5080947Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2024-08-06T20:48:00.5102955Z Entering 'third_party/sleef' 2024-08-06T20:48:00.5165705Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2024-08-06T20:48:00.5186787Z Entering 'third_party/tensorpipe' 2024-08-06T20:48:00.5249737Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2024-08-06T20:48:00.5269650Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T20:48:00.5333747Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2024-08-06T20:48:00.5353180Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T20:48:00.5414988Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2024-08-06T20:48:00.5435794Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T20:48:00.5498474Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2024-08-06T20:48:00.5518025Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T20:48:00.5579384Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2024-08-06T20:48:00.5599600Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T20:48:00.5662422Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2024-08-06T20:48:00.6680895Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2024-08-06T20:48:00.7055535Z Entering 'android/libs/fbjni' 2024-08-06T20:48:00.7107749Z Entering 'third_party/FP16' 2024-08-06T20:48:00.7160908Z Entering 'third_party/FXdiv' 2024-08-06T20:48:00.7213602Z Entering 'third_party/NNPACK' 2024-08-06T20:48:00.7266932Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T20:48:00.7321104Z Entering 'third_party/XNNPACK' 2024-08-06T20:48:00.7390454Z Entering 'third_party/benchmark' 2024-08-06T20:48:00.7443760Z Entering 'third_party/cpp-httplib' 2024-08-06T20:48:00.7497252Z Entering 'third_party/cpuinfo' 2024-08-06T20:48:00.7551030Z Entering 'third_party/cudnn_frontend' 2024-08-06T20:48:00.7604677Z Entering 'third_party/cutlass' 2024-08-06T20:48:00.7664413Z Entering 'third_party/eigen' 2024-08-06T20:48:00.7723139Z Entering 'third_party/fbgemm' 2024-08-06T20:48:00.7777142Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T20:48:00.7830538Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T20:48:00.7882082Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T20:48:00.7941702Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T20:48:00.7991552Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T20:48:00.8044633Z Entering 'third_party/flatbuffers' 2024-08-06T20:48:00.8099323Z Entering 'third_party/fmt' 2024-08-06T20:48:00.8150553Z Entering 'third_party/foxi' 2024-08-06T20:48:00.8201989Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T20:48:00.8253852Z Entering 'third_party/gloo' 2024-08-06T20:48:00.8305724Z Entering 'third_party/googletest' 2024-08-06T20:48:00.8357192Z Entering 'third_party/ideep' 2024-08-06T20:48:00.8407113Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T20:48:00.8471710Z Entering 'third_party/ittapi' 2024-08-06T20:48:00.8525222Z Entering 'third_party/kineto' 2024-08-06T20:48:00.8575690Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T20:48:00.8625776Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T20:48:00.8678980Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T20:48:00.8730531Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T20:48:00.8781965Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T20:48:00.8831798Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T20:48:00.8890260Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T20:48:00.8941800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T20:48:00.8993492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T20:48:00.9049412Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T20:48:00.9104136Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T20:48:00.9154994Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T20:48:00.9209189Z Entering 'third_party/mimalloc' 2024-08-06T20:48:00.9261864Z Entering 'third_party/nccl/nccl' 2024-08-06T20:48:00.9314919Z Entering 'third_party/nlohmann' 2024-08-06T20:48:00.9368865Z Entering 'third_party/onnx' 2024-08-06T20:48:00.9436906Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T20:48:00.9489552Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T20:48:00.9551869Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T20:48:00.9607050Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T20:48:00.9657192Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T20:48:00.9707577Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T20:48:00.9757829Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T20:48:00.9809412Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T20:48:00.9860250Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T20:48:00.9911079Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T20:48:00.9960645Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T20:48:01.0014504Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T20:48:01.0068029Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T20:48:01.0140492Z Entering 'third_party/pocketfft' 2024-08-06T20:48:01.0193418Z Entering 'third_party/protobuf' 2024-08-06T20:48:01.0250525Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T20:48:01.0301802Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T20:48:01.0356067Z Entering 'third_party/psimd' 2024-08-06T20:48:01.0408851Z Entering 'third_party/pthreadpool' 2024-08-06T20:48:01.0461677Z Entering 'third_party/pybind11' 2024-08-06T20:48:01.0515380Z Entering 'third_party/python-peachpy' 2024-08-06T20:48:01.0567614Z Entering 'third_party/sleef' 2024-08-06T20:48:01.0622090Z Entering 'third_party/tensorpipe' 2024-08-06T20:48:01.0673632Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T20:48:01.0726571Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T20:48:01.0776501Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T20:48:01.0827749Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T20:48:01.0876799Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T20:48:01.0953386Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2024-08-06T20:48:01.1320553Z Entering 'android/libs/fbjni' 2024-08-06T20:48:01.1372046Z Entering 'third_party/FP16' 2024-08-06T20:48:01.1423811Z Entering 'third_party/FXdiv' 2024-08-06T20:48:01.1475636Z Entering 'third_party/NNPACK' 2024-08-06T20:48:01.1531972Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T20:48:01.1583449Z Entering 'third_party/XNNPACK' 2024-08-06T20:48:01.1650611Z Entering 'third_party/benchmark' 2024-08-06T20:48:01.1706887Z Entering 'third_party/cpp-httplib' 2024-08-06T20:48:01.1758255Z Entering 'third_party/cpuinfo' 2024-08-06T20:48:01.1809772Z Entering 'third_party/cudnn_frontend' 2024-08-06T20:48:01.1861488Z Entering 'third_party/cutlass' 2024-08-06T20:48:01.1919965Z Entering 'third_party/eigen' 2024-08-06T20:48:01.1973811Z Entering 'third_party/fbgemm' 2024-08-06T20:48:01.2025717Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T20:48:01.2078236Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T20:48:01.2133960Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T20:48:01.2190396Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T20:48:01.2246383Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T20:48:01.2301178Z Entering 'third_party/flatbuffers' 2024-08-06T20:48:01.2355652Z Entering 'third_party/fmt' 2024-08-06T20:48:01.2406901Z Entering 'third_party/foxi' 2024-08-06T20:48:01.2458317Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T20:48:01.2511276Z Entering 'third_party/gloo' 2024-08-06T20:48:01.2562194Z Entering 'third_party/googletest' 2024-08-06T20:48:01.2613940Z Entering 'third_party/ideep' 2024-08-06T20:48:01.2664432Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T20:48:01.2723766Z Entering 'third_party/ittapi' 2024-08-06T20:48:01.2775424Z Entering 'third_party/kineto' 2024-08-06T20:48:01.2828719Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T20:48:01.2879389Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T20:48:01.2933281Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T20:48:01.2988098Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T20:48:01.3040643Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T20:48:01.3095265Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T20:48:01.3153072Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T20:48:01.3207512Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T20:48:01.3263530Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T20:48:01.3317896Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T20:48:01.3372216Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T20:48:01.3423527Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T20:48:01.3476533Z Entering 'third_party/mimalloc' 2024-08-06T20:48:01.3531775Z Entering 'third_party/nccl/nccl' 2024-08-06T20:48:01.3583729Z Entering 'third_party/nlohmann' 2024-08-06T20:48:01.3663583Z Entering 'third_party/onnx' 2024-08-06T20:48:01.3704036Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T20:48:01.3755810Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T20:48:01.3812989Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T20:48:01.3867576Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T20:48:01.3918782Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T20:48:01.3969064Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T20:48:01.4024805Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T20:48:01.4076062Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T20:48:01.4126433Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T20:48:01.4176596Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T20:48:01.4226388Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T20:48:01.4279098Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T20:48:01.4335155Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T20:48:01.4408310Z Entering 'third_party/pocketfft' 2024-08-06T20:48:01.4462643Z Entering 'third_party/protobuf' 2024-08-06T20:48:01.4516250Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T20:48:01.4566876Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T20:48:01.4621308Z Entering 'third_party/psimd' 2024-08-06T20:48:01.4673722Z Entering 'third_party/pthreadpool' 2024-08-06T20:48:01.4726631Z Entering 'third_party/pybind11' 2024-08-06T20:48:01.4778698Z Entering 'third_party/python-peachpy' 2024-08-06T20:48:01.4830527Z Entering 'third_party/sleef' 2024-08-06T20:48:01.4882636Z Entering 'third_party/tensorpipe' 2024-08-06T20:48:01.4933505Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T20:48:01.4984017Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T20:48:01.5033904Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T20:48:01.5083864Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T20:48:01.5134148Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T20:48:01.5209963Z ##[endgroup] 2024-08-06T20:48:01.5261438Z [command]/usr/bin/git log -1 --format='%H' 2024-08-06T20:48:01.5300985Z 'b9d86fa89636e301796d4201f36d86c73f6e49bc' 2024-08-06T20:48:01.5469630Z Prepare all required actions 2024-08-06T20:48:01.5470069Z Getting action download info 2024-08-06T20:48:01.6826172Z ##[group]Run ./.github/actions/setup-linux 2024-08-06T20:48:01.6826647Z env: 2024-08-06T20:48:01.6826846Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:01.6827169Z ##[endgroup] 2024-08-06T20:48:01.6876542Z ##[group]Run set -euo pipefail 2024-08-06T20:48:01.6876847Z set -euo pipefail 2024-08-06T20:48:01.6877106Z function get_ec2_metadata() { 2024-08-06T20:48:01.6877454Z  # Pulled from instance metadata endpoint for EC2 2024-08-06T20:48:01.6878024Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2024-08-06T20:48:01.6878521Z  category=$1 2024-08-06T20:48:01.6878847Z  # If it is GCP runner (runner name contains gcp), do not run this 2024-08-06T20:48:01.6879246Z  runner_name_str=i-0756c52faebfec338 2024-08-06T20:48:01.6879579Z  if [[ -f /.inarc ]]; then 2024-08-06T20:48:01.6879904Z  echo "ARC Runner, no info on ec2 metadata" 2024-08-06T20:48:01.6880252Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2024-08-06T20:48:01.6880680Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2024-08-06T20:48:01.6881068Z  else 2024-08-06T20:48:01.6881374Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2024-08-06T20:48:01.6881741Z  fi 2024-08-06T20:48:01.6881939Z } 2024-08-06T20:48:01.6882171Z echo "ami-id: $(get_ec2_metadata ami-id)" 2024-08-06T20:48:01.6882554Z echo "instance-id: $(get_ec2_metadata instance-id)" 2024-08-06T20:48:01.6882980Z echo "instance-type: $(get_ec2_metadata instance-type)" 2024-08-06T20:48:01.6883348Z echo "system info $(uname -a)" 2024-08-06T20:48:01.6891943Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:01.6892671Z env: 2024-08-06T20:48:01.6892869Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:01.6893108Z ##[endgroup] 2024-08-06T20:48:01.6992322Z ami-id: ami-06c68f701d8090592 2024-08-06T20:48:01.7056992Z instance-id: i-0756c52faebfec338 2024-08-06T20:48:01.7114720Z instance-type: g5.4xlarge 2024-08-06T20:48:01.7128213Z system info Linux ip-10-0-40-3.ec2.internal 6.1.94-99.176.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Jun 18 14:57:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux 2024-08-06T20:48:01.7161424Z ##[group]Run echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> $GITHUB_OUTPUT 2024-08-06T20:48:01.7162004Z echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> $GITHUB_OUTPUT 2024-08-06T20:48:01.7170343Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:01.7170678Z env: 2024-08-06T20:48:01.7170868Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:01.7171094Z ##[endgroup] 2024-08-06T20:48:01.7266609Z ##[group]Run if systemctl is-active --quiet docker; then 2024-08-06T20:48:01.7267019Z if systemctl is-active --quiet docker; then 2024-08-06T20:48:01.7267357Z  echo "Docker daemon is running..."; 2024-08-06T20:48:01.7267647Z else 2024-08-06T20:48:01.7267974Z  echo "Starting docker deamon..." && sudo systemctl start docker; 2024-08-06T20:48:01.7268344Z fi 2024-08-06T20:48:01.7276209Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:01.7276588Z env: 2024-08-06T20:48:01.7276786Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:01.7277042Z ##[endgroup] 2024-08-06T20:48:01.7359801Z Docker daemon is running... 2024-08-06T20:48:01.7409466Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2024-08-06T20:48:01.7409842Z with: 2024-08-06T20:48:01.7410034Z shell: bash 2024-08-06T20:48:01.7410239Z timeout_minutes: 5 2024-08-06T20:48:01.7410453Z max_attempts: 3 2024-08-06T20:48:01.7410672Z retry_wait_seconds: 30 2024-08-06T20:48:01.7411827Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2024-08-06T20:48:01.7412941Z polling_interval_seconds: 1 2024-08-06T20:48:01.7413193Z warning_on_retry: true 2024-08-06T20:48:01.7413431Z continue_on_error: false 2024-08-06T20:48:01.7413660Z env: 2024-08-06T20:48:01.7413849Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:01.7414091Z AWS_RETRY_MODE: standard 2024-08-06T20:48:01.7414327Z AWS_MAX_ATTEMPTS: 5 2024-08-06T20:48:01.7414554Z AWS_DEFAULT_REGION: us-east-1 2024-08-06T20:48:01.7414923Z ##[endgroup] 2024-08-06T20:48:02.9272242Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2024-08-06T20:48:02.9272976Z Configure a credential helper to remove this warning. See 2024-08-06T20:48:02.9273609Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-08-06T20:48:02.9274081Z 2024-08-06T20:48:02.9274224Z Login Succeeded 2024-08-06T20:48:03.7995055Z Command completed after 1 attempt(s). 2024-08-06T20:48:03.8078527Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-08-06T20:48:03.8079008Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-08-06T20:48:03.8079405Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-08-06T20:48:03.8088206Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:03.8088546Z env: 2024-08-06T20:48:03.8088736Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:03.8088971Z ##[endgroup] 2024-08-06T20:48:03.8188437Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2024-08-06T20:48:03.8188939Z # ignore expansion of "docker ps -q" since it could be empty 2024-08-06T20:48:03.8189310Z # shellcheck disable=SC2046 2024-08-06T20:48:03.8189614Z docker stop $(docker ps -q) || true 2024-08-06T20:48:03.8189933Z # Prune all of the docker images 2024-08-06T20:48:03.8190221Z docker system prune -af 2024-08-06T20:48:03.8198921Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:03.8199266Z env: 2024-08-06T20:48:03.8199448Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:03.8199674Z ##[endgroup] 2024-08-06T20:48:03.8510649Z "docker stop" requires at least 1 argument. 2024-08-06T20:48:03.8511013Z See 'docker stop --help'. 2024-08-06T20:48:03.8511172Z 2024-08-06T20:48:03.8511324Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2024-08-06T20:48:03.8511578Z 2024-08-06T20:48:03.8511682Z Stop one or more running containers 2024-08-06T20:48:03.8706284Z Total reclaimed space: 0B 2024-08-06T20:48:03.8773390Z ##[group]Run set +e 2024-08-06T20:48:03.8773655Z set +e 2024-08-06T20:48:03.8773862Z set -x 2024-08-06T20:48:03.8774052Z  2024-08-06T20:48:03.8774276Z PT_DOMAIN=download.pytorch.org 2024-08-06T20:48:03.8774907Z # TODO: Flaky access to download.pytorch.org https://github.com/pytorch/pytorch/issues/100400, 2024-08-06T20:48:03.8775577Z # cleaning this up once the issue is fixed. There are more than one resolved IP here, the last 2024-08-06T20:48:03.8776050Z # one is returned at random 2024-08-06T20:48:03.8776403Z RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" | tail -n1) 2024-08-06T20:48:03.8776740Z  2024-08-06T20:48:03.8776947Z if [ -z "${RESOLVED_IP}" ]; then 2024-08-06T20:48:03.8777331Z  echo "Couldn't resolve ${PT_DOMAIN}, retrying with Google DNS..." 2024-08-06T20:48:03.8777797Z  RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" @8.8.8.8 | tail -n1) 2024-08-06T20:48:03.8778139Z  2024-08-06T20:48:03.8778356Z  if [ -z "${RESOLVED_IP}" ]; then 2024-08-06T20:48:03.8778705Z  echo "Couldn't resolve ${PT_DOMAIN}, exiting..." 2024-08-06T20:48:03.8779021Z  exit 1 2024-08-06T20:48:03.8779231Z  fi 2024-08-06T20:48:03.8779636Z fi 2024-08-06T20:48:03.8779829Z  2024-08-06T20:48:03.8780068Z if grep -r "${PT_DOMAIN}" /etc/hosts; then 2024-08-06T20:48:03.8780396Z  # Clean up any old records first 2024-08-06T20:48:03.8780879Z  sudo sed -i "/${PT_DOMAIN}/d" /etc/hosts 2024-08-06T20:48:03.8781175Z fi 2024-08-06T20:48:03.8781363Z  2024-08-06T20:48:03.8781635Z echo "${RESOLVED_IP} ${PT_DOMAIN}" | sudo tee -a /etc/hosts 2024-08-06T20:48:03.8781987Z cat /etc/hosts 2024-08-06T20:48:03.8790580Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:03.8790920Z env: 2024-08-06T20:48:03.8791123Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:03.8791356Z ##[endgroup] 2024-08-06T20:48:03.8819939Z + PT_DOMAIN=download.pytorch.org 2024-08-06T20:48:03.8826444Z ++ dig -4 +short download.pytorch.org 2024-08-06T20:48:03.8826857Z ++ tail -n1 2024-08-06T20:48:03.8978126Z + RESOLVED_IP=18.160.10.22 2024-08-06T20:48:03.8978429Z + '[' -z 18.160.10.22 ']' 2024-08-06T20:48:03.8978700Z + grep -r download.pytorch.org /etc/hosts 2024-08-06T20:48:03.8992990Z 18.160.10.36 download.pytorch.org 2024-08-06T20:48:03.8995254Z + sudo sed -i /download.pytorch.org/d /etc/hosts 2024-08-06T20:48:04.0490174Z + echo '18.160.10.22 download.pytorch.org' 2024-08-06T20:48:04.0490570Z + sudo tee -a /etc/hosts 2024-08-06T20:48:04.0816299Z 18.160.10.22 download.pytorch.org 2024-08-06T20:48:04.0837548Z + cat /etc/hosts 2024-08-06T20:48:04.0847645Z 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 2024-08-06T20:48:04.0854124Z ::1 localhost6 localhost6.localdomain6 2024-08-06T20:48:04.0854584Z 18.160.10.22 download.pytorch.org 2024-08-06T20:48:04.0977668Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2024-08-06T20:48:04.0978085Z with: 2024-08-06T20:48:04.0978747Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.0979490Z docker-build-dir: .ci/docker 2024-08-06T20:48:04.0979750Z working-directory: . 2024-08-06T20:48:04.0980065Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.0980436Z force-push: false 2024-08-06T20:48:04.0980642Z env: 2024-08-06T20:48:04.0980832Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:04.0981059Z ##[endgroup] 2024-08-06T20:48:04.1005572Z ##[group]Run set -ex 2024-08-06T20:48:04.1005841Z set -ex 2024-08-06T20:48:04.1006043Z  2024-08-06T20:48:04.1006395Z # If the docker build directory or the build script doesn't exist, the action will 2024-08-06T20:48:04.1007004Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2024-08-06T20:48:04.1007503Z # job could then download the pre-built image as usual 2024-08-06T20:48:04.1007960Z if [[ ! -d "${DOCKER_BUILD_DIR}" ]] || [[ ! -f "${DOCKER_BUILD_DIR}/build.sh" ]]; then 2024-08-06T20:48:04.1008376Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1008783Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1009138Z  2024-08-06T20:48:04.1009457Z  echo "There is no Docker build script in ${REPO_NAME} repo, skipping..." 2024-08-06T20:48:04.1009843Z  exit 0 2024-08-06T20:48:04.1010043Z else 2024-08-06T20:48:04.1010284Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1010580Z fi 2024-08-06T20:48:04.1010768Z  2024-08-06T20:48:04.1011071Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2024-08-06T20:48:04.1011600Z  # The docker image name already includes the ECR prefix and tag, so we can just 2024-08-06T20:48:04.1012063Z  # use it as it is, but first let's extract the tag 2024-08-06T20:48:04.1012491Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2024-08-06T20:48:04.1012941Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1013371Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1013898Z else 2024-08-06T20:48:04.1014178Z  DOCKER_TAG=$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2024-08-06T20:48:04.1014581Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1015282Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1015768Z fi 2024-08-06T20:48:04.1024889Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:04.1025230Z env: 2024-08-06T20:48:04.1025423Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:04.1025661Z REPO_NAME: pytorch 2024-08-06T20:48:04.1026342Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1027071Z DOCKER_BUILD_DIR: .ci/docker 2024-08-06T20:48:04.1027398Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.1027757Z ##[endgroup] 2024-08-06T20:48:04.1055481Z + [[ ! -d .ci/docker ]] 2024-08-06T20:48:04.1055837Z + [[ ! -f .ci/docker/build.sh ]] 2024-08-06T20:48:04.1056109Z + echo skip=false 2024-08-06T20:48:04.1057139Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2024-08-06T20:48:04.1063748Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1064452Z ++ awk -F '[:,]' '{print $2}' 2024-08-06T20:48:04.1088586Z + DOCKER_TAG=02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1088998Z + echo docker-tag=02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1089799Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1121772Z ##[group]Run set +e 2024-08-06T20:48:04.1122041Z set +e 2024-08-06T20:48:04.1122243Z set -x 2024-08-06T20:48:04.1122444Z  2024-08-06T20:48:04.1122628Z login() { 2024-08-06T20:48:04.1123057Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2024-08-06T20:48:04.1123518Z } 2024-08-06T20:48:04.1123708Z  2024-08-06T20:48:04.1123904Z retry () { 2024-08-06T20:48:04.1124151Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2024-08-06T20:48:04.1124425Z } 2024-08-06T20:48:04.1124620Z  2024-08-06T20:48:04.1124839Z retry login "${DOCKER_REGISTRY}" 2024-08-06T20:48:04.1125106Z  2024-08-06T20:48:04.1125404Z # Check if image already exists, if it does then skip building it 2024-08-06T20:48:04.1125841Z if docker manifest inspect "${DOCKER_IMAGE}"; then 2024-08-06T20:48:04.1126171Z  exit 0 2024-08-06T20:48:04.1126364Z fi 2024-08-06T20:48:04.1126553Z  2024-08-06T20:48:04.1126858Z # NB: This part requires a full checkout. Otherwise, the merge base will 2024-08-06T20:48:04.1127364Z # be empty. The default action would be to continue rebuild the image 2024-08-06T20:48:04.1127816Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2024-08-06T20:48:04.1128226Z  # if we're on the base branch then use the parent commit 2024-08-06T20:48:04.1128574Z  MERGE_BASE=$(git rev-parse HEAD~) 2024-08-06T20:48:04.1128855Z else 2024-08-06T20:48:04.1129147Z  # otherwise we're on a PR, so use the most recent base commit 2024-08-06T20:48:04.1129562Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2024-08-06T20:48:04.1129881Z fi 2024-08-06T20:48:04.1130067Z  2024-08-06T20:48:04.1130271Z if [[ -z "${MERGE_BASE}" ]]; then 2024-08-06T20:48:04.1130747Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1131034Z  2024-08-06T20:48:04.1131436Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2024-08-06T20:48:04.1131916Z  exit 0 2024-08-06T20:48:04.1132131Z fi 2024-08-06T20:48:04.1132344Z  2024-08-06T20:48:04.1132631Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2024-08-06T20:48:04.1133237Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2024-08-06T20:48:04.1133740Z  exit 1 2024-08-06T20:48:04.1133940Z fi 2024-08-06T20:48:04.1134132Z  2024-08-06T20:48:04.1134445Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2024-08-06T20:48:04.1135135Z # If no image exists but the hash is the same as the previous hash then we should error out here 2024-08-06T20:48:04.1135662Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2024-08-06T20:48:04.1136257Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2024-08-06T20:48:04.1136920Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2024-08-06T20:48:04.1137325Z fi 2024-08-06T20:48:04.1137517Z  2024-08-06T20:48:04.1137890Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2024-08-06T20:48:04.1145714Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:04.1146051Z env: 2024-08-06T20:48:04.1146241Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:04.1146491Z DOCKER_BUILD_DIR: .ci/docker 2024-08-06T20:48:04.1146801Z BASE_REVISION: 1736af7cf736184c356be1bb00f59fb2feea6d7d 2024-08-06T20:48:04.1147567Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1148336Z DOCKER_TAG: 02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.1148748Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.1149087Z ##[endgroup] 2024-08-06T20:48:04.1174610Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.1175081Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.1177995Z + aws ecr get-login-password --region us-east-1 2024-08-06T20:48:04.1179179Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.6510097Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2024-08-06T20:48:04.6510772Z Configure a credential helper to remove this warning. See 2024-08-06T20:48:04.6511467Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-08-06T20:48:04.6511920Z 2024-08-06T20:48:04.6512435Z Login Succeeded 2024-08-06T20:48:04.6536940Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.8734930Z { 2024-08-06T20:48:04.8735526Z "schemaVersion": 2, 2024-08-06T20:48:04.8736544Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2024-08-06T20:48:04.8736948Z "config": { 2024-08-06T20:48:04.8737259Z "mediaType": "application/vnd.docker.container.image.v1+json", 2024-08-06T20:48:04.8737634Z "size": 48439, 2024-08-06T20:48:04.8738005Z "digest": "sha256:6ec36276acd88c9be8b44d856744037d399b35f4bb1703e637c27ae2b254c901" 2024-08-06T20:48:04.8738430Z }, 2024-08-06T20:48:04.8738604Z "layers": [ 2024-08-06T20:48:04.8738792Z { 2024-08-06T20:48:04.8739085Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8739459Z "size": 28580681, 2024-08-06T20:48:04.8739843Z "digest": "sha256:7a2c559011895d255fce249c00396abff5ae7e0c0a92931d0ed493e71de78e3a" 2024-08-06T20:48:04.8740524Z }, 2024-08-06T20:48:04.8740695Z { 2024-08-06T20:48:04.8740994Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8741355Z "size": 7943451, 2024-08-06T20:48:04.8741858Z "digest": "sha256:224fe954d7252f10539d243d6c9688806f7d13ad775ed02e7f7c79077844510d" 2024-08-06T20:48:04.8742450Z }, 2024-08-06T20:48:04.8742677Z { 2024-08-06T20:48:04.8743089Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8743466Z "size": 55728572, 2024-08-06T20:48:04.8743835Z "digest": "sha256:75722010b82e31715876aeeed0b2cee414296f0124fdfa061ab845ba2a158450" 2024-08-06T20:48:04.8744252Z }, 2024-08-06T20:48:04.8744423Z { 2024-08-06T20:48:04.8744771Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8745141Z "size": 186, 2024-08-06T20:48:04.8745520Z "digest": "sha256:d527cbbb87e3016fd72a18a9b468c945ad0ca27c5770b39debd6ed704db3a195" 2024-08-06T20:48:04.8745945Z }, 2024-08-06T20:48:04.8746114Z { 2024-08-06T20:48:04.8746409Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8746772Z "size": 6886, 2024-08-06T20:48:04.8747139Z "digest": "sha256:b57676e46aee1a8c82e528d78e5a13e31142524eea31c8b213d69ddcb6f1fe80" 2024-08-06T20:48:04.8747559Z }, 2024-08-06T20:48:04.8747726Z { 2024-08-06T20:48:04.8748022Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8748584Z "size": 1329001756, 2024-08-06T20:48:04.8748981Z "digest": "sha256:a8c1e85b5e14cec7af70bf304cb4d4cee6a1d25eb8215b2cf4fdc33e5af5e108" 2024-08-06T20:48:04.8749409Z }, 2024-08-06T20:48:04.8749579Z { 2024-08-06T20:48:04.8749862Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8750228Z "size": 62501, 2024-08-06T20:48:04.8750595Z "digest": "sha256:a41a8d1c11c8d80fe4e82b0d05478f8d51176ff20b8350905fc1b25c93a51198" 2024-08-06T20:48:04.8751007Z }, 2024-08-06T20:48:04.8751181Z { 2024-08-06T20:48:04.8751473Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8751833Z "size": 1684, 2024-08-06T20:48:04.8752196Z "digest": "sha256:0c12278907551c2962927d27c115f6f7bf0df894318b8aea6ece3ef01ccd0a8a" 2024-08-06T20:48:04.8752602Z }, 2024-08-06T20:48:04.8752769Z { 2024-08-06T20:48:04.8753062Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8753422Z "size": 1523, 2024-08-06T20:48:04.8753804Z "digest": "sha256:d8d1234baab3ec9ccb8bb710fc6b8ff6c10896ba2e8d27a347583eca770f9ff1" 2024-08-06T20:48:04.8754229Z }, 2024-08-06T20:48:04.8754394Z { 2024-08-06T20:48:04.8754705Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8755069Z "size": 2528295403, 2024-08-06T20:48:04.8755448Z "digest": "sha256:7ed32bc8e4696fcdb2feef850781160597b2275ad756819c4add88236b0577d5" 2024-08-06T20:48:04.8755866Z }, 2024-08-06T20:48:04.8756029Z { 2024-08-06T20:48:04.8756330Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8756697Z "size": 86016, 2024-08-06T20:48:04.8757077Z "digest": "sha256:ec1e7978c1fe161ced1d98092a51e7c5953ca5fda5577f54df9dbda4afff1b2b" 2024-08-06T20:48:04.8757507Z }, 2024-08-06T20:48:04.8757668Z { 2024-08-06T20:48:04.8757960Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8758327Z "size": 1823, 2024-08-06T20:48:04.8758690Z "digest": "sha256:66b43372aa397c4303ca4e0e1122516909bca0c87b9b4bfb3972b8fd0c1d4390" 2024-08-06T20:48:04.8759103Z }, 2024-08-06T20:48:04.8759276Z { 2024-08-06T20:48:04.8759562Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8759930Z "size": 246768020, 2024-08-06T20:48:04.8760305Z "digest": "sha256:b6662193c745ec6b991e800e920c233379c7c0e74f2f64d9b82dd5dc4a27eb14" 2024-08-06T20:48:04.8760711Z }, 2024-08-06T20:48:04.8760876Z { 2024-08-06T20:48:04.8761170Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8761619Z "size": 545, 2024-08-06T20:48:04.8762000Z "digest": "sha256:5be2b638d110dd5ed631ce7ddf7eefa26b3abd49cf3ab845be5ecb3daec46b67" 2024-08-06T20:48:04.8762507Z }, 2024-08-06T20:48:04.8762753Z { 2024-08-06T20:48:04.8763180Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8763691Z "size": 1283, 2024-08-06T20:48:04.8764197Z "digest": "sha256:71ca63790839b9bfa870ee6927d5d7b60aaa1fc65b38d3e8fc42ace8911859ef" 2024-08-06T20:48:04.8764779Z }, 2024-08-06T20:48:04.8765012Z { 2024-08-06T20:48:04.8765401Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8765917Z "size": 484, 2024-08-06T20:48:04.8766425Z "digest": "sha256:8a74804dc4fa9ad5369e1ae6677a4e17bcc2c53d209a67738dbc795420066650" 2024-08-06T20:48:04.8766954Z }, 2024-08-06T20:48:04.8767123Z { 2024-08-06T20:48:04.8767422Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8767798Z "size": 91712377, 2024-08-06T20:48:04.8768176Z "digest": "sha256:3bacb5389b745ab1f7590db3db714e689a99ee0d7c709f907ccd6906f39905c5" 2024-08-06T20:48:04.8768597Z }, 2024-08-06T20:48:04.8768755Z { 2024-08-06T20:48:04.8769048Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8769417Z "size": 3231, 2024-08-06T20:48:04.8769775Z "digest": "sha256:a8911a72541a4ab35894015b7fb1174ea61c59fedc863dfa563324af5d6ae752" 2024-08-06T20:48:04.8770319Z }, 2024-08-06T20:48:04.8770488Z { 2024-08-06T20:48:04.8770786Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8771153Z "size": 1909, 2024-08-06T20:48:04.8771508Z "digest": "sha256:55d020986bb7c1702235b111c4b83d990fa63ce6045c5ac358a026832bbe8550" 2024-08-06T20:48:04.8771919Z }, 2024-08-06T20:48:04.8772087Z { 2024-08-06T20:48:04.8772385Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8772798Z "size": 700, 2024-08-06T20:48:04.8773172Z "digest": "sha256:679e209a81f89d0be588ce19c3f5191f73883a86e44ab7b3653a3be3f267b69e" 2024-08-06T20:48:04.8773588Z }, 2024-08-06T20:48:04.8773760Z { 2024-08-06T20:48:04.8774060Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8774426Z "size": 2785856582, 2024-08-06T20:48:04.8774981Z "digest": "sha256:d4fb7093f54f7f71e63223ef934b9ab258d53922a199ac4736897cdb90df0683" 2024-08-06T20:48:04.8775395Z }, 2024-08-06T20:48:04.8775564Z { 2024-08-06T20:48:04.8775859Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8776227Z "size": 381, 2024-08-06T20:48:04.8776589Z "digest": "sha256:0d8ab4023e81a9284aef759a1b3c759a907d0cbd39361f3ef0ce4f8c3994f882" 2024-08-06T20:48:04.8777006Z }, 2024-08-06T20:48:04.8777175Z { 2024-08-06T20:48:04.8777464Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8777828Z "size": 12876, 2024-08-06T20:48:04.8778203Z "digest": "sha256:bf191f5f5a0a370ba7136fa618cd8cb1eb76e5f82b8c5773a965cdd105515924" 2024-08-06T20:48:04.8778616Z }, 2024-08-06T20:48:04.8778782Z { 2024-08-06T20:48:04.8779071Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8779428Z "size": 803, 2024-08-06T20:48:04.8779793Z "digest": "sha256:14653e4e245feef24e0aabd8a4cd81c24298f800facc0299f797b161da696a1d" 2024-08-06T20:48:04.8780214Z }, 2024-08-06T20:48:04.8780377Z { 2024-08-06T20:48:04.8780671Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8781033Z "size": 106, 2024-08-06T20:48:04.8781393Z "digest": "sha256:8bdbb000c39dd99342429f8a1183bdb36f312b532ea7e47eb7719fea84c669f6" 2024-08-06T20:48:04.8781809Z }, 2024-08-06T20:48:04.8781978Z { 2024-08-06T20:48:04.8782264Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8782628Z "size": 504, 2024-08-06T20:48:04.8782986Z "digest": "sha256:277383b63c0797c1bd9e23c6f38d6ba85e6e321e2dc6b21fcd832f1935f5af87" 2024-08-06T20:48:04.8783537Z }, 2024-08-06T20:48:04.8783704Z { 2024-08-06T20:48:04.8783987Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8784354Z "size": 121477300, 2024-08-06T20:48:04.8784728Z "digest": "sha256:890313244493db7d65ed3f1cf91a94e6e50bbdb4df87b5bb829a1a3236ffaeb3" 2024-08-06T20:48:04.8785132Z }, 2024-08-06T20:48:04.8785298Z { 2024-08-06T20:48:04.8785594Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8785951Z "size": 109, 2024-08-06T20:48:04.8786325Z "digest": "sha256:f1e3cc0f57ee16caa6ffefa72c065dfe99a5d19a3a352342dfa26b63661589a2" 2024-08-06T20:48:04.8786750Z }, 2024-08-06T20:48:04.8786913Z { 2024-08-06T20:48:04.8787205Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8787571Z "size": 491, 2024-08-06T20:48:04.8787939Z "digest": "sha256:c3cbae3fe054ce8c713ed90c42a306ecc164d8256fd73a14ff7b0e088e150b3f" 2024-08-06T20:48:04.8788371Z }, 2024-08-06T20:48:04.8788542Z { 2024-08-06T20:48:04.8788832Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8789197Z "size": 296, 2024-08-06T20:48:04.8789577Z "digest": "sha256:ccc148c4e7590ced33e52f40edecd2d5ec73cb4a42c87dacaf5c5a7a3912c17b" 2024-08-06T20:48:04.8789998Z }, 2024-08-06T20:48:04.8790167Z { 2024-08-06T20:48:04.8790456Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8790910Z "size": 103, 2024-08-06T20:48:04.8791287Z "digest": "sha256:7912f8c8e80ddc0dfc068c1282e6bd0ffd098b02458818c2c7b52a89c41d8335" 2024-08-06T20:48:04.8791702Z }, 2024-08-06T20:48:04.8791866Z { 2024-08-06T20:48:04.8792541Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8792913Z "size": 1473, 2024-08-06T20:48:04.8793269Z "digest": "sha256:d166ebb28213d6d30940b4fb9739863e9200174e7b550a1591e9028b1a039f83" 2024-08-06T20:48:04.8793680Z }, 2024-08-06T20:48:04.8793853Z { 2024-08-06T20:48:04.8794137Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8794502Z "size": 424146463, 2024-08-06T20:48:04.8794879Z "digest": "sha256:63bf315f789a755602aeb163e43e8173bc191c3dabc75e39ab31d0762bacc84f" 2024-08-06T20:48:04.8795285Z }, 2024-08-06T20:48:04.8795451Z { 2024-08-06T20:48:04.8795735Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8796099Z "size": 159, 2024-08-06T20:48:04.8796461Z "digest": "sha256:bdb818f7b2c8404f3e19777a27592349798986185a1f5b539309bbe8ea96e513" 2024-08-06T20:48:04.8796863Z }, 2024-08-06T20:48:04.8797030Z { 2024-08-06T20:48:04.8797323Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8797681Z "size": 566, 2024-08-06T20:48:04.8798047Z "digest": "sha256:89d8aea05b3a5e45fc1c48daf5ac32901006f7804ce5f2104112c2a2136acf28" 2024-08-06T20:48:04.8798462Z }, 2024-08-06T20:48:04.8798629Z { 2024-08-06T20:48:04.8798923Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8799293Z "size": 35874371, 2024-08-06T20:48:04.8799660Z "digest": "sha256:f1122e19f79064bde97285bf17ca6d8abb889972e5d95a463ffd2382145c1f22" 2024-08-06T20:48:04.8800072Z }, 2024-08-06T20:48:04.8800241Z { 2024-08-06T20:48:04.8800524Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8800889Z "size": 104, 2024-08-06T20:48:04.8801250Z "digest": "sha256:13d6ce3185e9912952041a572e2efa85b4544ec540f6050f750093d180a069f6" 2024-08-06T20:48:04.8801654Z }, 2024-08-06T20:48:04.8801821Z { 2024-08-06T20:48:04.8802114Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8802491Z "size": 425, 2024-08-06T20:48:04.8802890Z "digest": "sha256:feb3f80c392d4aef71730a9673030e955ce0e8a5c41f350eb7a00592d6b0dbb3" 2024-08-06T20:48:04.8803309Z }, 2024-08-06T20:48:04.8803472Z { 2024-08-06T20:48:04.8803769Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8804317Z "size": 20262075, 2024-08-06T20:48:04.8804690Z "digest": "sha256:4fe4cdcdfbd890964b8270a9140a5bf255709a21af4401b0428d91a735e8ac12" 2024-08-06T20:48:04.8805109Z }, 2024-08-06T20:48:04.8805282Z { 2024-08-06T20:48:04.8805566Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8805931Z "size": 440, 2024-08-06T20:48:04.8806308Z "digest": "sha256:be10b99d8ac8cfa04842a726627b1bdc764d3b6f1c591dca7933b86c93208c66" 2024-08-06T20:48:04.8806720Z }, 2024-08-06T20:48:04.8806890Z { 2024-08-06T20:48:04.8807178Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8807548Z "size": 700, 2024-08-06T20:48:04.8807912Z "digest": "sha256:679e209a81f89d0be588ce19c3f5191f73883a86e44ab7b3653a3be3f267b69e" 2024-08-06T20:48:04.8808320Z }, 2024-08-06T20:48:04.8808491Z { 2024-08-06T20:48:04.8808784Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8809151Z "size": 143, 2024-08-06T20:48:04.8809518Z "digest": "sha256:5980a36dfe02695abaecfad21a248c6c1902b07b2c9b69c61c39e342994e2f91" 2024-08-06T20:48:04.8809940Z }, 2024-08-06T20:48:04.8810100Z { 2024-08-06T20:48:04.8810392Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8810759Z "size": 135, 2024-08-06T20:48:04.8811124Z "digest": "sha256:94a4e0b3f19a399451a5f3cc7ddbde73ea16a7f180f7f047bf3ad868072c173f" 2024-08-06T20:48:04.8811672Z }, 2024-08-06T20:48:04.8811841Z { 2024-08-06T20:48:04.8812125Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8812489Z "size": 32, 2024-08-06T20:48:04.8812860Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-06T20:48:04.8813273Z }, 2024-08-06T20:48:04.8813443Z { 2024-08-06T20:48:04.8813734Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8814095Z "size": 189, 2024-08-06T20:48:04.8814461Z "digest": "sha256:2012c603f15449503b4671093a9ba6aff4fc99cf4923a92bb446fde7e52d59ee" 2024-08-06T20:48:04.8814960Z }, 2024-08-06T20:48:04.8815124Z { 2024-08-06T20:48:04.8815416Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8815781Z "size": 563, 2024-08-06T20:48:04.8816139Z "digest": "sha256:060890aa9610c5ec0050f85cafaa1f010ff178e2c8b0600aa3c43ad37ed48976" 2024-08-06T20:48:04.8816554Z }, 2024-08-06T20:48:04.8816723Z { 2024-08-06T20:48:04.8817019Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8817382Z "size": 43163116, 2024-08-06T20:48:04.8817760Z "digest": "sha256:c1a64eb8ee12a08340fb5c5a87dc012ff3074a8b683cc399feaa431de7402abd" 2024-08-06T20:48:04.8818173Z }, 2024-08-06T20:48:04.8818345Z { 2024-08-06T20:48:04.8818633Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8819002Z "size": 106, 2024-08-06T20:48:04.8819438Z "digest": "sha256:ed7686d06f1d744c9ec6dd0d75ae1581baefd7809deef8aefa11d54945c7888f" 2024-08-06T20:48:04.8829930Z }, 2024-08-06T20:48:04.8830175Z { 2024-08-06T20:48:04.8830611Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8830995Z "size": 1212, 2024-08-06T20:48:04.8831361Z "digest": "sha256:5c40be0141236773ddf2a3127f247bcc22540d4bebf4f3cc1df53f16f629ee35" 2024-08-06T20:48:04.8831782Z }, 2024-08-06T20:48:04.8831953Z { 2024-08-06T20:48:04.8832247Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8832616Z "size": 700, 2024-08-06T20:48:04.8832996Z "digest": "sha256:679e209a81f89d0be588ce19c3f5191f73883a86e44ab7b3653a3be3f267b69e" 2024-08-06T20:48:04.8833404Z }, 2024-08-06T20:48:04.8833571Z { 2024-08-06T20:48:04.8833862Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8834223Z "size": 138, 2024-08-06T20:48:04.8834585Z "digest": "sha256:95c1963010edc97c994c12e530dc7e5a5717123dfc4378fe8ecca9dbf79de394" 2024-08-06T20:48:04.8835130Z }, 2024-08-06T20:48:04.8835291Z { 2024-08-06T20:48:04.8835581Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8835942Z "size": 120, 2024-08-06T20:48:04.8836287Z "digest": "sha256:5805001913689846871dcb66b59a8d496e3c78fbf4b46c0c55cb11629af04779" 2024-08-06T20:48:04.8836684Z }, 2024-08-06T20:48:04.8836848Z { 2024-08-06T20:48:04.8837134Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8837505Z "size": 1916657670, 2024-08-06T20:48:04.8837892Z "digest": "sha256:b826637ebc384c2f2efbdc841bf6b8f0ac9b6a85060cab5d171d8ed8d49dd3de" 2024-08-06T20:48:04.8838306Z }, 2024-08-06T20:48:04.8838469Z { 2024-08-06T20:48:04.8838756Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8839114Z "size": 173, 2024-08-06T20:48:04.8839461Z "digest": "sha256:859f9c7a63754c26422062903f2a9991578a62fe2f1a81c9ad0f0e9517ab7387" 2024-08-06T20:48:04.8839867Z }, 2024-08-06T20:48:04.8840028Z { 2024-08-06T20:48:04.8840313Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8840677Z "size": 908, 2024-08-06T20:48:04.8841030Z "digest": "sha256:b89ac1530c4a96d2c4c0626a5202eb9e9a05e0d08517e1a5bf165257505309e8" 2024-08-06T20:48:04.8841432Z }, 2024-08-06T20:48:04.8841591Z { 2024-08-06T20:48:04.8841873Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8842337Z "size": 700, 2024-08-06T20:48:04.8842745Z "digest": "sha256:679e209a81f89d0be588ce19c3f5191f73883a86e44ab7b3653a3be3f267b69e" 2024-08-06T20:48:04.8843149Z }, 2024-08-06T20:48:04.8843313Z { 2024-08-06T20:48:04.8843596Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8843957Z "size": 134, 2024-08-06T20:48:04.8844323Z "digest": "sha256:4f10deed2e003fe5f78af780a6b4a71d0107d3ee59e41de3f35c031ca08e9d4d" 2024-08-06T20:48:04.8844737Z }, 2024-08-06T20:48:04.8844908Z { 2024-08-06T20:48:04.8845198Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8845554Z "size": 32, 2024-08-06T20:48:04.8845920Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-06T20:48:04.8846335Z }, 2024-08-06T20:48:04.8846495Z { 2024-08-06T20:48:04.8846782Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8847143Z "size": 156, 2024-08-06T20:48:04.8847497Z "digest": "sha256:336420751f1de11d750328660fdb6ebb9051881d009d399e893c10d61ba69b0c" 2024-08-06T20:48:04.8847895Z }, 2024-08-06T20:48:04.8848056Z { 2024-08-06T20:48:04.8848336Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8848698Z "size": 1841, 2024-08-06T20:48:04.8849055Z "digest": "sha256:f7f49611427c9bdc74d97703f780519d1d7d2b95a5377f6f625c8884cbc21d4e" 2024-08-06T20:48:04.8849450Z }, 2024-08-06T20:48:04.8849611Z { 2024-08-06T20:48:04.8849897Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8850264Z "size": 7529783, 2024-08-06T20:48:04.8850625Z "digest": "sha256:628b460c253a663b1f76b99fd2f00d63872fce39b0830a3b45bdeec4f5244660" 2024-08-06T20:48:04.8851027Z }, 2024-08-06T20:48:04.8851183Z { 2024-08-06T20:48:04.8851466Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8851828Z "size": 164, 2024-08-06T20:48:04.8852185Z "digest": "sha256:98e88ff103238559de2c0c76e43d76a01b94584edee356532b7723d1fd39dd85" 2024-08-06T20:48:04.8852585Z }, 2024-08-06T20:48:04.8852748Z { 2024-08-06T20:48:04.8853031Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8853387Z "size": 7944, 2024-08-06T20:48:04.8853749Z "digest": "sha256:6abf825f7962d4bc769dde6a63a4132694ecb9ba0f17006085d8c339aeedf887" 2024-08-06T20:48:04.8854155Z }, 2024-08-06T20:48:04.8854321Z { 2024-08-06T20:48:04.8854602Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8855215Z "size": 8063, 2024-08-06T20:48:04.8855578Z "digest": "sha256:844414c41546bd3c4dd14a45bbd58cca4a2aa0e8f37a781f8c386736ae4d4081" 2024-08-06T20:48:04.8855989Z }, 2024-08-06T20:48:04.8856143Z { 2024-08-06T20:48:04.8856432Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8856789Z "size": 300, 2024-08-06T20:48:04.8857143Z "digest": "sha256:b92a0d83e22950e600ffd0f6391f5c20b499107ba973cab4d7a54a5c65a922b1" 2024-08-06T20:48:04.8857554Z }, 2024-08-06T20:48:04.8857714Z { 2024-08-06T20:48:04.8857997Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8858356Z "size": 7629841, 2024-08-06T20:48:04.8858709Z "digest": "sha256:56e4340bc9e3886f7c099a66772a040a2d34cf0782746af58b0317af979cdfa3" 2024-08-06T20:48:04.8859106Z }, 2024-08-06T20:48:04.8859266Z { 2024-08-06T20:48:04.8859552Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8859919Z "size": 108, 2024-08-06T20:48:04.8860265Z "digest": "sha256:26f48d882588278c8763af295a6bc7147c492d82c6e4395970856a29fb8d77f0" 2024-08-06T20:48:04.8860655Z }, 2024-08-06T20:48:04.8860818Z { 2024-08-06T20:48:04.8861103Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8861462Z "size": 54145778, 2024-08-06T20:48:04.8861830Z "digest": "sha256:b6fe2821ba25ab984577df156aed9b873699ef0f46b6230d8e9a54f9ee22be1e" 2024-08-06T20:48:04.8862241Z }, 2024-08-06T20:48:04.8862498Z { 2024-08-06T20:48:04.8862818Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8863205Z "size": 473, 2024-08-06T20:48:04.8863565Z "digest": "sha256:fae8722cca7f32933f7a25f1491c31ea9a6df4fc1f9fb2360bd29c79b04f1c56" 2024-08-06T20:48:04.8863979Z }, 2024-08-06T20:48:04.8864143Z { 2024-08-06T20:48:04.8864424Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8864788Z "size": 1374858912, 2024-08-06T20:48:04.8865169Z "digest": "sha256:3c7c25c582fced622823798bd877a7fb903ebd4bfecd93c32e43dbd536bb8202" 2024-08-06T20:48:04.8865576Z }, 2024-08-06T20:48:04.8865740Z { 2024-08-06T20:48:04.8866026Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8866384Z "size": 106, 2024-08-06T20:48:04.8866740Z "digest": "sha256:75a49c2f3f0a99be9760740cce745e1ffd508a15bf5ef08077b2032b4d4d97ce" 2024-08-06T20:48:04.8867153Z }, 2024-08-06T20:48:04.8867311Z { 2024-08-06T20:48:04.8867603Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8867960Z "size": 558, 2024-08-06T20:48:04.8868322Z "digest": "sha256:b32c97699ecde27b65bfbbd8ba207755eb28584f2fc64501f4a320045ae969c8" 2024-08-06T20:48:04.8868731Z }, 2024-08-06T20:48:04.8868886Z { 2024-08-06T20:48:04.8869174Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8869532Z "size": 46248557, 2024-08-06T20:48:04.8869885Z "digest": "sha256:b926a85168171349a0ff57c87aa52b9174d3704512eb2e687184f5552883312a" 2024-08-06T20:48:04.8870281Z }, 2024-08-06T20:48:04.8870441Z { 2024-08-06T20:48:04.8870723Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8871086Z "size": 111, 2024-08-06T20:48:04.8871438Z "digest": "sha256:1c5d35b9a7607fd72af03dc281fe78215973e63c51c1b823a704727c8a0944eb" 2024-08-06T20:48:04.8871838Z }, 2024-08-06T20:48:04.8871999Z { 2024-08-06T20:48:04.8872286Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8872643Z "size": 32, 2024-08-06T20:48:04.8873004Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-06T20:48:04.8873420Z }, 2024-08-06T20:48:04.8873578Z { 2024-08-06T20:48:04.8873863Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8874224Z "size": 32, 2024-08-06T20:48:04.8874578Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-06T20:48:04.8875079Z }, 2024-08-06T20:48:04.8875236Z { 2024-08-06T20:48:04.8875517Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8875876Z "size": 32, 2024-08-06T20:48:04.8876233Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-06T20:48:04.8876644Z }, 2024-08-06T20:48:04.8876809Z { 2024-08-06T20:48:04.8877093Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-06T20:48:04.8877455Z "size": 32, 2024-08-06T20:48:04.8877814Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-06T20:48:04.8878228Z } 2024-08-06T20:48:04.8878390Z ] 2024-08-06T20:48:04.8878548Z } 2024-08-06T20:48:04.8878720Z + exit 0 2024-08-06T20:48:04.8940197Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*/} 2024-08-06T20:48:04.8940543Z tag=${ECR_DOCKER_IMAGE##*/} 2024-08-06T20:48:04.8940898Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2024-08-06T20:48:04.8949492Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:04.8949819Z env: 2024-08-06T20:48:04.8950014Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:04.8950709Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.8951416Z ##[endgroup] 2024-08-06T20:48:04.8983939Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.9035472Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2024-08-06T20:48:04.9035884Z with: 2024-08-06T20:48:04.9036522Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.9037323Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.9037667Z env: 2024-08-06T20:48:04.9037867Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:04.9038116Z ##[endgroup] 2024-08-06T20:48:04.9061304Z ##[group]Run set -x 2024-08-06T20:48:04.9061557Z set -x 2024-08-06T20:48:04.9061751Z set +e 2024-08-06T20:48:04.9061946Z  2024-08-06T20:48:04.9062138Z login() { 2024-08-06T20:48:04.9062560Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2024-08-06T20:48:04.9063019Z } 2024-08-06T20:48:04.9063210Z  2024-08-06T20:48:04.9063425Z retry () { 2024-08-06T20:48:04.9063684Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2024-08-06T20:48:04.9064000Z } 2024-08-06T20:48:04.9064194Z  2024-08-06T20:48:04.9064414Z retry login "${DOCKER_REGISTRY}" 2024-08-06T20:48:04.9064722Z  2024-08-06T20:48:04.9064922Z set -e 2024-08-06T20:48:04.9065258Z # ignore output since only exit code is used for conditional 2024-08-06T20:48:04.9065771Z # only pull docker image if it's not available locally 2024-08-06T20:48:04.9066337Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2024-08-06T20:48:04.9066856Z  retry docker pull "${DOCKER_IMAGE}" 2024-08-06T20:48:04.9067143Z fi 2024-08-06T20:48:04.9075666Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:48:04.9075993Z env: 2024-08-06T20:48:04.9076189Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:48:04.9076880Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:04.9077661Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.9078008Z ##[endgroup] 2024-08-06T20:48:04.9106654Z + set +e 2024-08-06T20:48:04.9107018Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.9107401Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:04.9110346Z + aws ecr get-login-password --region us-east-1 2024-08-06T20:48:04.9112116Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-06T20:48:05.4438180Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2024-08-06T20:48:05.4438803Z Configure a credential helper to remove this warning. See 2024-08-06T20:48:05.4439315Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-08-06T20:48:05.4439665Z 2024-08-06T20:48:05.4440156Z Login Succeeded 2024-08-06T20:48:05.4463871Z + set -e 2024-08-06T20:48:05.4464762Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:05.4624355Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:05.4625537Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:48:05.7321091Z 02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9: Pulling from pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9 2024-08-06T20:48:05.7322023Z 7a2c55901189: Pulling fs layer 2024-08-06T20:48:05.7322396Z 224fe954d725: Pulling fs layer 2024-08-06T20:48:05.7322758Z 75722010b82e: Pulling fs layer 2024-08-06T20:48:05.7323167Z d527cbbb87e3: Pulling fs layer 2024-08-06T20:48:05.7323954Z b57676e46aee: Pulling fs layer 2024-08-06T20:48:05.7324361Z a8c1e85b5e14: Pulling fs layer 2024-08-06T20:48:05.7324731Z a41a8d1c11c8: Pulling fs layer 2024-08-06T20:48:05.7325116Z 0c1227890755: Pulling fs layer 2024-08-06T20:48:05.7325479Z d8d1234baab3: Pulling fs layer 2024-08-06T20:48:05.7325852Z 7ed32bc8e469: Pulling fs layer 2024-08-06T20:48:05.7326165Z ec1e7978c1fe: Pulling fs layer 2024-08-06T20:48:05.7326426Z 66b43372aa39: Pulling fs layer 2024-08-06T20:48:05.7326790Z b6662193c745: Pulling fs layer 2024-08-06T20:48:05.7327044Z 5be2b638d110: Pulling fs layer 2024-08-06T20:48:05.7327323Z 71ca63790839: Pulling fs layer 2024-08-06T20:48:05.7327622Z a8c1e85b5e14: Waiting 2024-08-06T20:48:05.7327887Z 8a74804dc4fa: Pulling fs layer 2024-08-06T20:48:05.7328249Z 3bacb5389b74: Pulling fs layer 2024-08-06T20:48:05.7328552Z d527cbbb87e3: Waiting 2024-08-06T20:48:05.7328782Z a8911a72541a: Pulling fs layer 2024-08-06T20:48:05.7329136Z b57676e46aee: Waiting 2024-08-06T20:48:05.7329454Z 0c1227890755: Waiting 2024-08-06T20:48:05.7329705Z 55d020986bb7: Pulling fs layer 2024-08-06T20:48:05.7329948Z a41a8d1c11c8: Waiting 2024-08-06T20:48:05.7330176Z 679e209a81f8: Pulling fs layer 2024-08-06T20:48:05.7330435Z d4fb7093f54f: Pulling fs layer 2024-08-06T20:48:05.7330673Z ec1e7978c1fe: Waiting 2024-08-06T20:48:05.7330901Z 0d8ab4023e81: Pulling fs layer 2024-08-06T20:48:05.7331154Z bf191f5f5a0a: Pulling fs layer 2024-08-06T20:48:05.7331397Z 66b43372aa39: Waiting 2024-08-06T20:48:05.7331654Z b6662193c745: Waiting 2024-08-06T20:48:05.7331875Z 14653e4e245f: Pulling fs layer 2024-08-06T20:48:05.7332111Z d8d1234baab3: Waiting 2024-08-06T20:48:05.7332345Z 8bdbb000c39d: Pulling fs layer 2024-08-06T20:48:05.7332589Z 71ca63790839: Waiting 2024-08-06T20:48:05.7332801Z 277383b63c07: Pulling fs layer 2024-08-06T20:48:05.7333042Z 679e209a81f8: Waiting 2024-08-06T20:48:05.7333260Z 5be2b638d110: Waiting 2024-08-06T20:48:05.7333474Z 890313244493: Pulling fs layer 2024-08-06T20:48:05.7333715Z 3bacb5389b74: Waiting 2024-08-06T20:48:05.7333977Z 0d8ab4023e81: Waiting 2024-08-06T20:48:05.7334255Z f1e3cc0f57ee: Pulling fs layer 2024-08-06T20:48:05.7334569Z 55d020986bb7: Waiting 2024-08-06T20:48:05.7334928Z bf191f5f5a0a: Waiting 2024-08-06T20:48:05.7335142Z 277383b63c07: Waiting 2024-08-06T20:48:05.7335363Z 890313244493: Waiting 2024-08-06T20:48:05.7335594Z c3cbae3fe054: Pulling fs layer 2024-08-06T20:48:05.7335861Z ccc148c4e759: Pulling fs layer 2024-08-06T20:48:05.7336123Z 7912f8c8e80d: Pulling fs layer 2024-08-06T20:48:05.7336579Z d166ebb28213: Pulling fs layer 2024-08-06T20:48:05.7336829Z c3cbae3fe054: Waiting 2024-08-06T20:48:05.7337065Z 63bf315f789a: Pulling fs layer 2024-08-06T20:48:05.7337304Z ccc148c4e759: Waiting 2024-08-06T20:48:05.7337536Z bdb818f7b2c8: Pulling fs layer 2024-08-06T20:48:05.7337784Z 7912f8c8e80d: Waiting 2024-08-06T20:48:05.7338010Z 89d8aea05b3a: Pulling fs layer 2024-08-06T20:48:05.7338274Z f1122e19f790: Pulling fs layer 2024-08-06T20:48:05.7338530Z 13d6ce3185e9: Pulling fs layer 2024-08-06T20:48:05.7338779Z bdb818f7b2c8: Waiting 2024-08-06T20:48:05.7339013Z feb3f80c392d: Pulling fs layer 2024-08-06T20:48:05.7339263Z 89d8aea05b3a: Waiting 2024-08-06T20:48:05.7339489Z 4fe4cdcdfbd8: Pulling fs layer 2024-08-06T20:48:05.7339748Z be10b99d8ac8: Pulling fs layer 2024-08-06T20:48:05.7339995Z 13d6ce3185e9: Waiting 2024-08-06T20:48:05.7340206Z 63bf315f789a: Waiting 2024-08-06T20:48:05.7340436Z 5980a36dfe02: Pulling fs layer 2024-08-06T20:48:05.7340681Z feb3f80c392d: Waiting 2024-08-06T20:48:05.7340905Z 4fe4cdcdfbd8: Waiting 2024-08-06T20:48:05.7341125Z a8911a72541a: Waiting 2024-08-06T20:48:05.7341347Z 94a4e0b3f19a: Pulling fs layer 2024-08-06T20:48:05.7341606Z 4f4fb700ef54: Pulling fs layer 2024-08-06T20:48:05.7341866Z 5980a36dfe02: Waiting 2024-08-06T20:48:05.7342084Z 94a4e0b3f19a: Waiting 2024-08-06T20:48:05.7342311Z 2012c603f154: Pulling fs layer 2024-08-06T20:48:05.7342566Z 060890aa9610: Pulling fs layer 2024-08-06T20:48:05.7342815Z c1a64eb8ee12: Pulling fs layer 2024-08-06T20:48:05.7343080Z ed7686d06f1d: Pulling fs layer 2024-08-06T20:48:05.7343526Z 2012c603f154: Waiting 2024-08-06T20:48:05.7343870Z 5c40be014123: Pulling fs layer 2024-08-06T20:48:05.7344237Z 060890aa9610: Waiting 2024-08-06T20:48:05.7344554Z 95c1963010ed: Pulling fs layer 2024-08-06T20:48:05.7344957Z ed7686d06f1d: Waiting 2024-08-06T20:48:05.7345234Z 580500191368: Pulling fs layer 2024-08-06T20:48:05.7345491Z b826637ebc38: Pulling fs layer 2024-08-06T20:48:05.7345732Z 5c40be014123: Waiting 2024-08-06T20:48:05.7345971Z 859f9c7a6375: Pulling fs layer 2024-08-06T20:48:05.7346228Z b89ac1530c4a: Pulling fs layer 2024-08-06T20:48:05.7346476Z c1a64eb8ee12: Waiting 2024-08-06T20:48:05.7346695Z b826637ebc38: Waiting 2024-08-06T20:48:05.7346906Z b89ac1530c4a: Waiting 2024-08-06T20:48:05.7347135Z 4f10deed2e00: Pulling fs layer 2024-08-06T20:48:05.7347380Z 580500191368: Waiting 2024-08-06T20:48:05.7347589Z 4f10deed2e00: Waiting 2024-08-06T20:48:05.7347814Z 336420751f1d: Pulling fs layer 2024-08-06T20:48:05.7348066Z f7f49611427c: Pulling fs layer 2024-08-06T20:48:05.7348323Z 628b460c253a: Pulling fs layer 2024-08-06T20:48:05.7348576Z 98e88ff10323: Pulling fs layer 2024-08-06T20:48:05.7348816Z f7f49611427c: Waiting 2024-08-06T20:48:05.7349037Z 6abf825f7962: Pulling fs layer 2024-08-06T20:48:05.7349290Z 844414c41546: Pulling fs layer 2024-08-06T20:48:05.7349545Z b92a0d83e229: Pulling fs layer 2024-08-06T20:48:05.7349794Z 56e4340bc9e3: Pulling fs layer 2024-08-06T20:48:05.7350048Z 26f48d882588: Pulling fs layer 2024-08-06T20:48:05.7350293Z 6abf825f7962: Waiting 2024-08-06T20:48:05.7350501Z 844414c41546: Waiting 2024-08-06T20:48:05.7350728Z b6fe2821ba25: Pulling fs layer 2024-08-06T20:48:05.7350994Z fae8722cca7f: Pulling fs layer 2024-08-06T20:48:05.7351244Z 3c7c25c582fc: Pulling fs layer 2024-08-06T20:48:05.7351489Z 26f48d882588: Waiting 2024-08-06T20:48:05.7351786Z 75a49c2f3f0a: Pulling fs layer 2024-08-06T20:48:05.7352031Z b92a0d83e229: Waiting 2024-08-06T20:48:05.7352260Z b32c97699ecd: Pulling fs layer 2024-08-06T20:48:05.7352505Z b926a8516817: Pulling fs layer 2024-08-06T20:48:05.7352752Z 75a49c2f3f0a: Waiting 2024-08-06T20:48:05.7352971Z b32c97699ecd: Waiting 2024-08-06T20:48:05.7353194Z 1c5d35b9a760: Pulling fs layer 2024-08-06T20:48:05.7353438Z 98e88ff10323: Waiting 2024-08-06T20:48:05.7353657Z fae8722cca7f: Waiting 2024-08-06T20:48:05.7353869Z 1c5d35b9a760: Waiting 2024-08-06T20:48:05.8737498Z 224fe954d725: Verifying Checksum 2024-08-06T20:48:05.8737827Z 224fe954d725: Download complete 2024-08-06T20:48:05.9503062Z d527cbbb87e3: Verifying Checksum 2024-08-06T20:48:05.9503704Z d527cbbb87e3: Download complete 2024-08-06T20:48:06.0209052Z b57676e46aee: Download complete 2024-08-06T20:48:06.0810801Z 7a2c55901189: Verifying Checksum 2024-08-06T20:48:06.0811103Z 7a2c55901189: Download complete 2024-08-06T20:48:06.1557309Z a41a8d1c11c8: Download complete 2024-08-06T20:48:06.2292367Z 0c1227890755: Download complete 2024-08-06T20:48:06.2979590Z d8d1234baab3: Download complete 2024-08-06T20:48:06.3526592Z 75722010b82e: Verifying Checksum 2024-08-06T20:48:06.3526999Z 75722010b82e: Download complete 2024-08-06T20:48:06.4990240Z ec1e7978c1fe: Verifying Checksum 2024-08-06T20:48:06.4990763Z ec1e7978c1fe: Download complete 2024-08-06T20:48:06.6001188Z 66b43372aa39: Verifying Checksum 2024-08-06T20:48:06.6001506Z 66b43372aa39: Download complete 2024-08-06T20:48:07.2461914Z 7a2c55901189: Pull complete 2024-08-06T20:48:07.5689698Z 224fe954d725: Pull complete 2024-08-06T20:48:08.5051398Z 75722010b82e: Pull complete 2024-08-06T20:48:08.5232612Z d527cbbb87e3: Pull complete 2024-08-06T20:48:08.5422809Z b57676e46aee: Pull complete 2024-08-06T20:48:09.1260656Z b6662193c745: Verifying Checksum 2024-08-06T20:48:09.1261008Z b6662193c745: Download complete 2024-08-06T20:48:09.2198875Z 5be2b638d110: Download complete 2024-08-06T20:48:09.2955111Z 71ca63790839: Verifying Checksum 2024-08-06T20:48:09.2955419Z 71ca63790839: Download complete 2024-08-06T20:48:09.4022454Z 8a74804dc4fa: Verifying Checksum 2024-08-06T20:48:09.4023055Z 8a74804dc4fa: Download complete 2024-08-06T20:48:10.3685932Z 3bacb5389b74: Verifying Checksum 2024-08-06T20:48:10.3686369Z 3bacb5389b74: Download complete 2024-08-06T20:48:10.4491016Z a8911a72541a: Verifying Checksum 2024-08-06T20:48:10.4491452Z a8911a72541a: Download complete 2024-08-06T20:48:10.5340905Z 55d020986bb7: Verifying Checksum 2024-08-06T20:48:10.5341262Z 55d020986bb7: Download complete 2024-08-06T20:48:10.6171407Z 679e209a81f8: Download complete 2024-08-06T20:48:19.3659503Z a8c1e85b5e14: Verifying Checksum 2024-08-06T20:48:19.3659868Z a8c1e85b5e14: Download complete 2024-08-06T20:48:19.5394390Z bf191f5f5a0a: Download complete 2024-08-06T20:48:19.6004178Z 14653e4e245f: Verifying Checksum 2024-08-06T20:48:19.6004606Z 14653e4e245f: Download complete 2024-08-06T20:48:19.6829758Z 8bdbb000c39d: Verifying Checksum 2024-08-06T20:48:19.6830085Z 8bdbb000c39d: Download complete 2024-08-06T20:48:19.7753020Z 277383b63c07: Verifying Checksum 2024-08-06T20:48:19.7753329Z 277383b63c07: Download complete 2024-08-06T20:48:21.0555884Z 890313244493: Verifying Checksum 2024-08-06T20:48:21.0556276Z 890313244493: Download complete 2024-08-06T20:48:21.1404137Z f1e3cc0f57ee: Verifying Checksum 2024-08-06T20:48:21.1404544Z f1e3cc0f57ee: Download complete 2024-08-06T20:48:21.2255349Z c3cbae3fe054: Download complete 2024-08-06T20:48:21.3036178Z ccc148c4e759: Download complete 2024-08-06T20:48:21.3705020Z 7912f8c8e80d: Verifying Checksum 2024-08-06T20:48:21.3705420Z 7912f8c8e80d: Download complete 2024-08-06T20:48:21.4403094Z d166ebb28213: Download complete 2024-08-06T20:48:25.7415520Z 63bf315f789a: Download complete 2024-08-06T20:48:25.8305915Z bdb818f7b2c8: Download complete 2024-08-06T20:48:25.9160711Z 89d8aea05b3a: Verifying Checksum 2024-08-06T20:48:25.9161042Z 89d8aea05b3a: Download complete 2024-08-06T20:48:26.3195968Z f1122e19f790: Verifying Checksum 2024-08-06T20:48:26.3196362Z f1122e19f790: Download complete 2024-08-06T20:48:26.4028725Z 13d6ce3185e9: Verifying Checksum 2024-08-06T20:48:26.4029022Z 13d6ce3185e9: Download complete 2024-08-06T20:48:26.4707325Z feb3f80c392d: Verifying Checksum 2024-08-06T20:48:26.4707787Z feb3f80c392d: Download complete 2024-08-06T20:48:26.7228090Z 4fe4cdcdfbd8: Verifying Checksum 2024-08-06T20:48:26.7228474Z 4fe4cdcdfbd8: Download complete 2024-08-06T20:48:26.8032067Z be10b99d8ac8: Verifying Checksum 2024-08-06T20:48:26.8032513Z be10b99d8ac8: Download complete 2024-08-06T20:48:26.8951462Z 5980a36dfe02: Verifying Checksum 2024-08-06T20:48:26.8951932Z 5980a36dfe02: Download complete 2024-08-06T20:48:26.9687564Z 94a4e0b3f19a: Verifying Checksum 2024-08-06T20:48:26.9688322Z 94a4e0b3f19a: Download complete 2024-08-06T20:48:26.9767532Z 4f4fb700ef54: Download complete 2024-08-06T20:48:27.0444926Z 2012c603f154: Download complete 2024-08-06T20:48:27.1105535Z 060890aa9610: Download complete 2024-08-06T20:48:27.6208555Z c1a64eb8ee12: Verifying Checksum 2024-08-06T20:48:27.6209016Z c1a64eb8ee12: Download complete 2024-08-06T20:48:27.7162058Z ed7686d06f1d: Verifying Checksum 2024-08-06T20:48:27.7162466Z ed7686d06f1d: Download complete 2024-08-06T20:48:27.7924351Z 5c40be014123: Verifying Checksum 2024-08-06T20:48:27.7924668Z 5c40be014123: Download complete 2024-08-06T20:48:27.8762310Z 95c1963010ed: Verifying Checksum 2024-08-06T20:48:27.8762634Z 95c1963010ed: Download complete 2024-08-06T20:48:27.9505354Z 580500191368: Download complete 2024-08-06T20:48:31.6716481Z 7ed32bc8e469: Verifying Checksum 2024-08-06T20:48:31.6716931Z 7ed32bc8e469: Download complete 2024-08-06T20:48:31.7564631Z 859f9c7a6375: Verifying Checksum 2024-08-06T20:48:31.7565003Z 859f9c7a6375: Download complete 2024-08-06T20:48:31.8412935Z b89ac1530c4a: Verifying Checksum 2024-08-06T20:48:31.8413311Z b89ac1530c4a: Download complete 2024-08-06T20:48:31.9337747Z 4f10deed2e00: Download complete 2024-08-06T20:48:32.0160177Z 336420751f1d: Verifying Checksum 2024-08-06T20:48:32.0160618Z 336420751f1d: Download complete 2024-08-06T20:48:32.0882710Z f7f49611427c: Verifying Checksum 2024-08-06T20:48:32.0883190Z f7f49611427c: Download complete 2024-08-06T20:48:32.2174217Z 628b460c253a: Verifying Checksum 2024-08-06T20:48:32.2175134Z 628b460c253a: Download complete 2024-08-06T20:48:32.3013631Z 98e88ff10323: Verifying Checksum 2024-08-06T20:48:32.3014217Z 98e88ff10323: Download complete 2024-08-06T20:48:32.3481917Z a8c1e85b5e14: Pull complete 2024-08-06T20:48:32.3848006Z 6abf825f7962: Download complete 2024-08-06T20:48:32.4547042Z 844414c41546: Download complete 2024-08-06T20:48:32.4789002Z a41a8d1c11c8: Pull complete 2024-08-06T20:48:32.5302497Z b92a0d83e229: Verifying Checksum 2024-08-06T20:48:32.5302969Z b92a0d83e229: Download complete 2024-08-06T20:48:32.6227059Z 0c1227890755: Pull complete 2024-08-06T20:48:32.6592885Z 56e4340bc9e3: Verifying Checksum 2024-08-06T20:48:32.6593199Z 56e4340bc9e3: Download complete 2024-08-06T20:48:32.7280458Z 26f48d882588: Download complete 2024-08-06T20:48:32.7534444Z d8d1234baab3: Pull complete 2024-08-06T20:48:33.3207976Z b6fe2821ba25: Verifying Checksum 2024-08-06T20:48:33.3208409Z b6fe2821ba25: Download complete 2024-08-06T20:48:33.3957095Z fae8722cca7f: Verifying Checksum 2024-08-06T20:48:33.3957451Z fae8722cca7f: Download complete 2024-08-06T20:48:38.5358862Z d4fb7093f54f: Verifying Checksum 2024-08-06T20:48:38.5359222Z d4fb7093f54f: Download complete 2024-08-06T20:48:38.6192530Z 75a49c2f3f0a: Verifying Checksum 2024-08-06T20:48:38.6192976Z 75a49c2f3f0a: Download complete 2024-08-06T20:48:38.6914447Z b32c97699ecd: Download complete 2024-08-06T20:48:39.2081838Z b926a8516817: Verifying Checksum 2024-08-06T20:48:39.2082336Z b926a8516817: Download complete 2024-08-06T20:48:39.2820756Z 1c5d35b9a760: Verifying Checksum 2024-08-06T20:48:39.2821155Z 1c5d35b9a760: Download complete 2024-08-06T20:48:47.1706503Z b826637ebc38: Verifying Checksum 2024-08-06T20:48:47.1706841Z b826637ebc38: Download complete 2024-08-06T20:48:47.1884018Z 3c7c25c582fc: Verifying Checksum 2024-08-06T20:48:47.1884336Z 3c7c25c582fc: Download complete 2024-08-06T20:49:04.1576047Z 7ed32bc8e469: Pull complete 2024-08-06T20:49:04.3498642Z ec1e7978c1fe: Pull complete 2024-08-06T20:49:04.5694791Z 66b43372aa39: Pull complete 2024-08-06T20:49:13.2367201Z b6662193c745: Pull complete 2024-08-06T20:49:13.4616097Z 5be2b638d110: Pull complete 2024-08-06T20:49:13.6960631Z 71ca63790839: Pull complete 2024-08-06T20:49:13.8006778Z 8a74804dc4fa: Pull complete 2024-08-06T20:49:16.4314108Z 3bacb5389b74: Pull complete 2024-08-06T20:49:16.6642364Z a8911a72541a: Pull complete 2024-08-06T20:49:16.8896886Z 55d020986bb7: Pull complete 2024-08-06T20:49:17.1287762Z 679e209a81f8: Pull complete 2024-08-06T20:50:11.8200994Z d4fb7093f54f: Pull complete 2024-08-06T20:50:11.9268132Z 0d8ab4023e81: Pull complete 2024-08-06T20:50:12.1463560Z bf191f5f5a0a: Pull complete 2024-08-06T20:50:12.3817989Z 14653e4e245f: Pull complete 2024-08-06T20:50:12.5998186Z 8bdbb000c39d: Pull complete 2024-08-06T20:50:12.8140792Z 277383b63c07: Pull complete 2024-08-06T20:50:16.3833826Z 890313244493: Pull complete 2024-08-06T20:50:16.6185962Z f1e3cc0f57ee: Pull complete 2024-08-06T20:50:16.8439361Z c3cbae3fe054: Pull complete 2024-08-06T20:50:17.0778134Z ccc148c4e759: Pull complete 2024-08-06T20:50:17.3028537Z 7912f8c8e80d: Pull complete 2024-08-06T20:50:17.5264504Z d166ebb28213: Pull complete 2024-08-06T20:50:26.9917407Z 63bf315f789a: Pull complete 2024-08-06T20:50:27.2269217Z bdb818f7b2c8: Pull complete 2024-08-06T20:50:27.4489669Z 89d8aea05b3a: Pull complete 2024-08-06T20:50:28.4383820Z f1122e19f790: Pull complete 2024-08-06T20:50:28.6335433Z 13d6ce3185e9: Pull complete 2024-08-06T20:50:28.7146202Z feb3f80c392d: Pull complete 2024-08-06T20:50:29.1317927Z 4fe4cdcdfbd8: Pull complete 2024-08-06T20:50:29.2484050Z be10b99d8ac8: Pull complete 2024-08-06T20:50:29.5994126Z 5980a36dfe02: Pull complete 2024-08-06T20:50:29.8146832Z 94a4e0b3f19a: Pull complete 2024-08-06T20:50:30.0403547Z 4f4fb700ef54: Pull complete 2024-08-06T20:50:30.2658337Z 2012c603f154: Pull complete 2024-08-06T20:50:30.4888393Z 060890aa9610: Pull complete 2024-08-06T20:50:33.1596852Z c1a64eb8ee12: Pull complete 2024-08-06T20:50:33.3932855Z ed7686d06f1d: Pull complete 2024-08-06T20:50:33.5652302Z 5c40be014123: Pull complete 2024-08-06T20:50:33.8482707Z 95c1963010ed: Pull complete 2024-08-06T20:50:33.9657772Z 580500191368: Pull complete 2024-08-06T20:51:09.3390663Z b826637ebc38: Pull complete 2024-08-06T20:51:09.5580074Z 859f9c7a6375: Pull complete 2024-08-06T20:51:09.6601295Z b89ac1530c4a: Pull complete 2024-08-06T20:51:09.6947574Z 4f10deed2e00: Pull complete 2024-08-06T20:51:09.7274609Z 336420751f1d: Pull complete 2024-08-06T20:51:09.7447630Z f7f49611427c: Pull complete 2024-08-06T20:51:09.9155418Z 628b460c253a: Pull complete 2024-08-06T20:51:09.9339584Z 98e88ff10323: Pull complete 2024-08-06T20:51:09.9554017Z 6abf825f7962: Pull complete 2024-08-06T20:51:09.9731303Z 844414c41546: Pull complete 2024-08-06T20:51:09.9898043Z b92a0d83e229: Pull complete 2024-08-06T20:51:11.1052432Z 56e4340bc9e3: Pull complete 2024-08-06T20:51:11.1252423Z 26f48d882588: Pull complete 2024-08-06T20:51:13.1150415Z b6fe2821ba25: Pull complete 2024-08-06T20:51:13.2653279Z fae8722cca7f: Pull complete 2024-08-06T20:51:26.8091391Z 3c7c25c582fc: Pull complete 2024-08-06T20:51:27.0424044Z 75a49c2f3f0a: Pull complete 2024-08-06T20:51:27.2779255Z b32c97699ecd: Pull complete 2024-08-06T20:51:28.1735485Z b926a8516817: Pull complete 2024-08-06T20:51:28.4016632Z 1c5d35b9a760: Pull complete 2024-08-06T20:51:29.4509703Z Digest: sha256:00f47b036f588ca5ef8866f8635fabba5a95cdf9ff1adae7d2a674ef1d4076e9 2024-08-06T20:51:29.5026375Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:51:29.5285398Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:51:29.5346036Z ##[group]Run echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> "$GITHUB_OUTPUT" 2024-08-06T20:51:29.5346699Z echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> "$GITHUB_OUTPUT" 2024-08-06T20:51:29.5360689Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:51:29.5361030Z env: 2024-08-06T20:51:29.5361220Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:51:29.5361458Z ##[endgroup] 2024-08-06T20:51:29.5701484Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2024-08-06T20:51:29.5701863Z with: 2024-08-06T20:51:29.5702077Z driver-version: 550.54.15 2024-08-06T20:51:29.5702321Z env: 2024-08-06T20:51:29.5702513Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:51:29.5702752Z ##[endgroup] 2024-08-06T20:51:29.5814323Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2024-08-06T20:51:29.5814756Z with: 2024-08-06T20:51:29.5814954Z timeout_minutes: 10 2024-08-06T20:51:29.5815193Z max_attempts: 3 2024-08-06T20:51:29.5843811Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils if [[ "${DISTRIBUTION}" == "amzn2023" ]] ; then YUM_REPO_URL="https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo" else # Amazon Linux 2 YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" fi sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y nvidia-docker2 sudo systemctl restart docker ) } install_nvidia_docker2_ubuntu20() { ( set -x # Install nvidia-driver package if not installed status="$(dpkg-query -W --showformat='${db:Status-Status}' nvidia-docker2 2>&1)" if [ ! $? = 0 ] || [ ! "$status" = installed ]; then sudo apt-get install -y nvidia-docker2 sudo systemctl restart docker fi ) } pre_install_nvidia_driver_amzn2() { ( # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms ) } install_nvidia_driver_common() { ( # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then # CAUTION: this may need to be updated in future if [ "${DISTRIBUTION}" != ubuntu20.04 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight fi sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi ) } post_install_nvidia_driver_common() { ( sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi # NB: Annoyingly, nvidia-smi command returns successfully with return code 0 even in # the case where the driver has already crashed as it still can get the driver version # and some basic information like the bus ID. However, the rest of the information # would be missing (ERR!), for example: # # +-----------------------------------------------------------------------------+ # | NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0 | # |-------------------------------+----------------------+----------------------+ # | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | # | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | # | | | MIG M. | # |===============================+======================+======================| # | 0 ERR! Off | 00000000:00:1E.0 Off | ERR! | # |ERR! ERR! ERR! ERR! / ERR! | 4184MiB / 23028MiB | ERR! Default | # | | | ERR! | # +-------------------------------+----------------------+----------------------+ # # +-----------------------------------------------------------------------------+ # | Processes: | # | GPU GI CI PID Type Process name GPU Memory | # | ID ID Usage | # |=============================================================================| # +-----------------------------------------------------------------------------+ # # This should be reported as a failure instead as it will guarantee to fail when # Docker tries to run with --gpus all # # So, the correct check here is to query one of the missing piece of info like # GPU name, so that the command can fail accordingly nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 NVIDIA_SMI_STATUS=$? # Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } install_nvidia_driver_amzn2() { ( set -x pre_install_nvidia_driver_amzn2 install_nvidia_driver_common post_install_nvidia_driver_common ) } install_nvidia_driver_ubuntu20() { ( set -x install_nvidia_driver_common post_install_nvidia_driver_common ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; ubuntu20.04) install_nvidia_driver_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; ubuntu20.04) install_nvidia_docker2_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" # Fix https://github.com/NVIDIA/nvidia-docker/issues/1648 on runners with # more than one GPUs. This just needs to be run once. The command fails # on subsequent runs and complains that the mode is already on, but that's # ok sudo nvidia-persistenced || true # This should show persistence mode ON nvidia-smi 2024-08-06T20:51:29.5866903Z retry_wait_seconds: 10 2024-08-06T20:51:29.5867179Z polling_interval_seconds: 1 2024-08-06T20:51:29.5867430Z warning_on_retry: true 2024-08-06T20:51:29.5867672Z continue_on_error: false 2024-08-06T20:51:29.5867905Z env: 2024-08-06T20:51:29.5868089Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:51:29.5868341Z DRIVER_VERSION: 550.54.15 2024-08-06T20:51:29.5868572Z ##[endgroup] 2024-08-06T20:51:29.6667859Z == Installing nvidia driver NVIDIA-Linux-x86_64-550.54.15.run == 2024-08-06T20:51:29.6668950Z + pre_install_nvidia_driver_amzn2 2024-08-06T20:51:29.6671182Z + sudo yum remove -y nvidia-driver-latest-dkms 2024-08-06T20:51:29.9961382Z No match for argument: nvidia-driver-latest-dkms 2024-08-06T20:51:29.9962361Z No packages marked for removal. 2024-08-06T20:51:30.0024696Z Dependencies resolved. 2024-08-06T20:51:30.0034382Z Nothing to do. 2024-08-06T20:51:30.0034868Z Complete! 2024-08-06T20:51:30.0885633Z + install_nvidia_driver_common 2024-08-06T20:51:30.0893493Z + echo 'Before installing NVIDIA driver' 2024-08-06T20:51:30.0893925Z Before installing NVIDIA driver 2024-08-06T20:51:30.0895309Z + lspci 2024-08-06T20:51:30.1007484Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2024-08-06T20:51:30.1008129Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2024-08-06T20:51:30.1008752Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2024-08-06T20:51:30.1009252Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2024-08-06T20:51:30.1009709Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2024-08-06T20:51:30.1010391Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2024-08-06T20:51:30.1011027Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2024-08-06T20:51:30.1011518Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2024-08-06T20:51:30.1011969Z + lsmod 2024-08-06T20:51:30.1058224Z Module Size Used by 2024-08-06T20:51:30.1058650Z veth 36864 0 2024-08-06T20:51:30.1058927Z nvidia_modeset 1351680 0 2024-08-06T20:51:30.1059204Z video 65536 1 nvidia_modeset 2024-08-06T20:51:30.1059600Z wmi 36864 1 video 2024-08-06T20:51:30.1059964Z nvidia_uvm 4706304 0 2024-08-06T20:51:30.1060590Z nvidia 54071296 7 nvidia_uvm,nvidia_modeset 2024-08-06T20:51:30.1061009Z drm 602112 1 nvidia 2024-08-06T20:51:30.1061394Z drm_panel_orientation_quirks 28672 1 drm 2024-08-06T20:51:30.1061745Z backlight 24576 3 video,drm,nvidia_modeset 2024-08-06T20:51:30.1062078Z i2c_core 106496 2 nvidia,drm 2024-08-06T20:51:30.1062355Z xt_conntrack 16384 1 2024-08-06T20:51:30.1062611Z nft_chain_nat 16384 3 2024-08-06T20:51:30.1062860Z xt_MASQUERADE 20480 1 2024-08-06T20:51:30.1063138Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2024-08-06T20:51:30.1063462Z nf_conntrack_netlink 57344 0 2024-08-06T20:51:30.1063841Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2024-08-06T20:51:30.1064249Z nf_defrag_ipv6 24576 1 nf_conntrack 2024-08-06T20:51:30.1064552Z nf_defrag_ipv4 16384 1 nf_conntrack 2024-08-06T20:51:30.1064840Z xfrm_user 57344 1 2024-08-06T20:51:30.1065092Z xfrm_algo 16384 1 xfrm_user 2024-08-06T20:51:30.1065371Z xt_addrtype 16384 2 2024-08-06T20:51:30.1065608Z nft_compat 20480 4 2024-08-06T20:51:30.1065899Z nf_tables 307200 57 nft_compat,nft_chain_nat 2024-08-06T20:51:30.1066311Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2024-08-06T20:51:30.1066662Z br_netfilter 36864 0 2024-08-06T20:51:30.1066927Z bridge 307200 1 br_netfilter 2024-08-06T20:51:30.1067214Z stp 16384 1 bridge 2024-08-06T20:51:30.1067485Z llc 16384 2 bridge,stp 2024-08-06T20:51:30.1067764Z overlay 167936 0 2024-08-06T20:51:30.1068006Z tls 114688 0 2024-08-06T20:51:30.1068242Z nls_ascii 16384 1 2024-08-06T20:51:30.1068487Z nls_cp437 20480 1 2024-08-06T20:51:30.1068719Z vfat 24576 1 2024-08-06T20:51:30.1068968Z fat 86016 1 vfat 2024-08-06T20:51:30.1069226Z sunrpc 692224 1 2024-08-06T20:51:30.1069466Z ghash_clmulni_intel 16384 0 2024-08-06T20:51:30.1069717Z ena 167936 0 2024-08-06T20:51:30.1069961Z ptp 36864 1 ena 2024-08-06T20:51:30.1070223Z pps_core 24576 1 ptp 2024-08-06T20:51:30.1070485Z aesni_intel 393216 0 2024-08-06T20:51:30.1070726Z i8042 45056 0 2024-08-06T20:51:30.1070961Z serio 28672 3 i8042 2024-08-06T20:51:30.1071241Z crypto_simd 16384 1 aesni_intel 2024-08-06T20:51:30.1071518Z button 24576 0 2024-08-06T20:51:30.1071810Z cryptd 28672 2 crypto_simd,ghash_clmulni_intel 2024-08-06T20:51:30.1072137Z sch_fq_codel 20480 17 2024-08-06T20:51:30.1072387Z dm_mod 188416 0 2024-08-06T20:51:30.1072624Z dax 45056 1 dm_mod 2024-08-06T20:51:30.1072885Z loop 36864 0 2024-08-06T20:51:30.1073125Z fuse 163840 1 2024-08-06T20:51:30.1073360Z configfs 57344 1 2024-08-06T20:51:30.1073600Z dmi_sysfs 20480 0 2024-08-06T20:51:30.1073846Z crc32_pclmul 16384 0 2024-08-06T20:51:30.1074085Z crc32c_intel 24576 0 2024-08-06T20:51:30.1074329Z efivarfs 24576 1 2024-08-06T20:51:30.1074578Z + modinfo nvidia 2024-08-06T20:51:30.1076855Z filename: /lib/modules/6.1.94-99.176.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2024-08-06T20:51:30.1077297Z alias: char-major-195-* 2024-08-06T20:51:30.1077557Z version: 550.54.15 2024-08-06T20:51:30.1077905Z supported: external 2024-08-06T20:51:30.1078142Z license: NVIDIA 2024-08-06T20:51:30.1078398Z firmware: nvidia/550.54.15/gsp_tu10x.bin 2024-08-06T20:51:30.1078716Z firmware: nvidia/550.54.15/gsp_ga10x.bin 2024-08-06T20:51:30.1079019Z srcversion: 833721318DA517F0C2FEC97 2024-08-06T20:51:30.1079324Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2024-08-06T20:51:30.1079720Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2024-08-06T20:51:30.1080042Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2024-08-06T20:51:30.1080344Z depends: i2c-core,drm 2024-08-06T20:51:30.1080586Z retpoline: Y 2024-08-06T20:51:30.1080796Z name: nvidia 2024-08-06T20:51:30.1081136Z vermagic: 6.1.94-99.176.amzn2023.x86_64 SMP preempt mod_unload modversions 2024-08-06T20:51:30.1081572Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2024-08-06T20:51:30.1082009Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2024-08-06T20:51:30.1082418Z parm: NVreg_ResmanDebugLevel:int 2024-08-06T20:51:30.1082710Z parm: NVreg_RmLogonRC:int 2024-08-06T20:51:30.1083002Z parm: NVreg_ModifyDeviceFiles:int 2024-08-06T20:51:30.1083306Z parm: NVreg_DeviceFileUID:int 2024-08-06T20:51:30.1083589Z parm: NVreg_DeviceFileGID:int 2024-08-06T20:51:30.1083889Z parm: NVreg_DeviceFileMode:int 2024-08-06T20:51:30.1084241Z parm: NVreg_InitializeSystemMemoryAllocations:int 2024-08-06T20:51:30.1084612Z parm: NVreg_UsePageAttributeTable:int 2024-08-06T20:51:30.1084935Z parm: NVreg_EnablePCIeGen3:int 2024-08-06T20:51:30.1085230Z parm: NVreg_EnableMSI:int 2024-08-06T20:51:30.1085507Z parm: NVreg_TCEBypassMode:int 2024-08-06T20:51:30.1085813Z parm: NVreg_EnableStreamMemOPs:int 2024-08-06T20:51:30.1086165Z parm: NVreg_RestrictProfilingToAdminUsers:int 2024-08-06T20:51:30.1086546Z parm: NVreg_PreserveVideoMemoryAllocations:int 2024-08-06T20:51:30.1086920Z parm: NVreg_EnableS0ixPowerManagement:int 2024-08-06T20:51:30.1087317Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2024-08-06T20:51:30.1087705Z parm: NVreg_DynamicPowerManagement:int 2024-08-06T20:51:30.1088106Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2024-08-06T20:51:30.1088505Z parm: NVreg_EnableGpuFirmware:int 2024-08-06T20:51:30.1088824Z parm: NVreg_EnableGpuFirmwareLogs:int 2024-08-06T20:51:30.1089177Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2024-08-06T20:51:30.1089535Z parm: NVreg_EnableUserNUMAManagement:int 2024-08-06T20:51:30.1089855Z parm: NVreg_MemoryPoolSize:int 2024-08-06T20:51:30.1090165Z parm: NVreg_KMallocHeapMaxSize:int 2024-08-06T20:51:30.1090481Z parm: NVreg_VMallocHeapMaxSize:int 2024-08-06T20:51:30.1090786Z parm: NVreg_IgnoreMMIOCheck:int 2024-08-06T20:51:30.1091083Z parm: NVreg_NvLinkDisable:int 2024-08-06T20:51:30.1091420Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2024-08-06T20:51:30.1091760Z parm: NVreg_RegisterPCIDriver:int 2024-08-06T20:51:30.1092351Z parm: NVreg_EnableResizableBar:int 2024-08-06T20:51:30.1092708Z parm: NVreg_EnableDbgBreakpoint:int 2024-08-06T20:51:30.1093036Z parm: NVreg_EnableNonblockingOpen:int 2024-08-06T20:51:30.1093367Z parm: NVreg_RegistryDwords:charp 2024-08-06T20:51:30.1093697Z parm: NVreg_RegistryDwordsPerDevice:charp 2024-08-06T20:51:30.1094013Z parm: NVreg_RmMsg:charp 2024-08-06T20:51:30.1094390Z parm: NVreg_GpuBlacklist:charp 2024-08-06T20:51:30.1094709Z parm: NVreg_TemporaryFilePath:charp 2024-08-06T20:51:30.1095026Z parm: NVreg_ExcludedGpus:charp 2024-08-06T20:51:30.1095335Z parm: NVreg_DmaRemapPeerMmio:int 2024-08-06T20:51:30.1095652Z parm: NVreg_RmNvlinkBandwidth:charp 2024-08-06T20:51:30.1095963Z parm: NVreg_ImexChannelCount:int 2024-08-06T20:51:30.1096434Z parm: rm_firmware_active:charp 2024-08-06T20:51:30.1096731Z + HAS_NVIDIA_DRIVER=0 2024-08-06T20:51:30.1096964Z ++ command -v nvidia-smi 2024-08-06T20:51:30.1097218Z + '[' -x /usr/bin/nvidia-smi ']' 2024-08-06T20:51:30.1097475Z + set +e 2024-08-06T20:51:30.1097769Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2024-08-06T20:51:30.1379447Z + INSTALLED_DRIVER_VERSION=550.54.15 2024-08-06T20:51:30.1379744Z + NVIDIA_SMI_STATUS=0 2024-08-06T20:51:30.1380166Z + '[' 0 -ne 0 ']' 2024-08-06T20:51:30.1380412Z + '[' 550.54.15 '!=' 550.54.15 ']' 2024-08-06T20:51:30.1380671Z + HAS_NVIDIA_DRIVER=1 2024-08-06T20:51:30.1381093Z + echo 'NVIDIA driver (550.54.15) has already been installed. Skipping NVIDIA driver installation' 2024-08-06T20:51:30.1381539Z + set -e 2024-08-06T20:51:30.1381719Z + '[' 1 -eq 0 ']' 2024-08-06T20:51:30.1382088Z NVIDIA driver (550.54.15) has already been installed. Skipping NVIDIA driver installation 2024-08-06T20:51:30.1382971Z + post_install_nvidia_driver_common 2024-08-06T20:51:30.1386543Z + sudo modprobe nvidia 2024-08-06T20:51:30.2591331Z + echo 'After installing NVIDIA driver' 2024-08-06T20:51:30.2591715Z + lspci 2024-08-06T20:51:30.2591931Z After installing NVIDIA driver 2024-08-06T20:51:30.2698421Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2024-08-06T20:51:30.2699099Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2024-08-06T20:51:30.2699728Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2024-08-06T20:51:30.2700227Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2024-08-06T20:51:30.2700685Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2024-08-06T20:51:30.2701234Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2024-08-06T20:51:30.2701894Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2024-08-06T20:51:30.2702502Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2024-08-06T20:51:30.2702888Z + lsmod 2024-08-06T20:51:30.2734425Z Module Size Used by 2024-08-06T20:51:30.2735213Z veth 36864 0 2024-08-06T20:51:30.2735929Z nvidia_modeset 1351680 0 2024-08-06T20:51:30.2736685Z video 65536 1 nvidia_modeset 2024-08-06T20:51:30.2737299Z wmi 36864 1 video 2024-08-06T20:51:30.2737721Z nvidia_uvm 4706304 0 2024-08-06T20:51:30.2738117Z nvidia 54071296 7 nvidia_uvm,nvidia_modeset 2024-08-06T20:51:30.2738542Z drm 602112 1 nvidia 2024-08-06T20:51:30.2738940Z drm_panel_orientation_quirks 28672 1 drm 2024-08-06T20:51:30.2739279Z backlight 24576 3 video,drm,nvidia_modeset 2024-08-06T20:51:30.2739614Z i2c_core 106496 2 nvidia,drm 2024-08-06T20:51:30.2739895Z xt_conntrack 16384 1 2024-08-06T20:51:30.2740137Z nft_chain_nat 16384 3 2024-08-06T20:51:30.2740385Z xt_MASQUERADE 20480 1 2024-08-06T20:51:30.2740677Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2024-08-06T20:51:30.2740988Z nf_conntrack_netlink 57344 0 2024-08-06T20:51:30.2741373Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2024-08-06T20:51:30.2741790Z nf_defrag_ipv6 24576 1 nf_conntrack 2024-08-06T20:51:30.2742085Z nf_defrag_ipv4 16384 1 nf_conntrack 2024-08-06T20:51:30.2742368Z xfrm_user 57344 1 2024-08-06T20:51:30.2742623Z xfrm_algo 16384 1 xfrm_user 2024-08-06T20:51:30.2742891Z xt_addrtype 16384 2 2024-08-06T20:51:30.2743138Z nft_compat 20480 4 2024-08-06T20:51:30.2743430Z nf_tables 307200 57 nft_compat,nft_chain_nat 2024-08-06T20:51:30.2743818Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2024-08-06T20:51:30.2744177Z br_netfilter 36864 0 2024-08-06T20:51:30.2744439Z bridge 307200 1 br_netfilter 2024-08-06T20:51:30.2745001Z stp 16384 1 bridge 2024-08-06T20:51:30.2745281Z llc 16384 2 bridge,stp 2024-08-06T20:51:30.2745550Z overlay 167936 0 2024-08-06T20:51:30.2745781Z tls 114688 0 2024-08-06T20:51:30.2746018Z nls_ascii 16384 1 2024-08-06T20:51:30.2746257Z nls_cp437 20480 1 2024-08-06T20:51:30.2746647Z vfat 24576 1 2024-08-06T20:51:30.2746898Z fat 86016 1 vfat 2024-08-06T20:51:30.2747152Z sunrpc 692224 1 2024-08-06T20:51:30.2747392Z ghash_clmulni_intel 16384 0 2024-08-06T20:51:30.2747635Z ena 167936 0 2024-08-06T20:51:30.2747874Z ptp 36864 1 ena 2024-08-06T20:51:30.2748127Z pps_core 24576 1 ptp 2024-08-06T20:51:30.2748389Z aesni_intel 393216 0 2024-08-06T20:51:30.2748623Z i8042 45056 0 2024-08-06T20:51:30.2750340Z serio 28672 3 i8042 2024-08-06T20:51:30.2750621Z crypto_simd 16384 1 aesni_intel 2024-08-06T20:51:30.2750894Z button 24576 0 2024-08-06T20:51:30.2751178Z cryptd 28672 2 crypto_simd,ghash_clmulni_intel 2024-08-06T20:51:30.2751502Z sch_fq_codel 20480 17 2024-08-06T20:51:30.2751747Z dm_mod 188416 0 2024-08-06T20:51:30.2751985Z dax 45056 1 dm_mod 2024-08-06T20:51:30.2752243Z loop 36864 0 2024-08-06T20:51:30.2752477Z fuse 163840 1 2024-08-06T20:51:30.2752706Z configfs 57344 1 2024-08-06T20:51:30.2752943Z dmi_sysfs 20480 0 2024-08-06T20:51:30.2753193Z crc32_pclmul 16384 0 2024-08-06T20:51:30.2770617Z crc32c_intel 24576 0 2024-08-06T20:51:30.2770943Z efivarfs 24576 1 2024-08-06T20:51:30.2771223Z + modinfo nvidia 2024-08-06T20:51:30.2771661Z filename: /lib/modules/6.1.94-99.176.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2024-08-06T20:51:30.2772103Z alias: char-major-195-* 2024-08-06T20:51:30.2772367Z version: 550.54.15 2024-08-06T20:51:30.2772620Z supported: external 2024-08-06T20:51:30.2772861Z license: NVIDIA 2024-08-06T20:51:30.2773112Z firmware: nvidia/550.54.15/gsp_tu10x.bin 2024-08-06T20:51:30.2773442Z firmware: nvidia/550.54.15/gsp_ga10x.bin 2024-08-06T20:51:30.2773761Z srcversion: 833721318DA517F0C2FEC97 2024-08-06T20:51:30.2774064Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2024-08-06T20:51:30.2774579Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2024-08-06T20:51:30.2774906Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2024-08-06T20:51:30.2775208Z depends: i2c-core,drm 2024-08-06T20:51:30.2775494Z retpoline: Y 2024-08-06T20:51:30.2775709Z name: nvidia 2024-08-06T20:51:30.2776070Z vermagic: 6.1.94-99.176.amzn2023.x86_64 SMP preempt mod_unload modversions 2024-08-06T20:51:30.2776517Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2024-08-06T20:51:30.2776957Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2024-08-06T20:51:30.2777360Z parm: NVreg_ResmanDebugLevel:int 2024-08-06T20:51:30.2777666Z parm: NVreg_RmLogonRC:int 2024-08-06T20:51:30.2777964Z parm: NVreg_ModifyDeviceFiles:int 2024-08-06T20:51:30.2778262Z parm: NVreg_DeviceFileUID:int 2024-08-06T20:51:30.2778571Z parm: NVreg_DeviceFileGID:int 2024-08-06T20:51:30.2778872Z parm: NVreg_DeviceFileMode:int 2024-08-06T20:51:30.2779220Z parm: NVreg_InitializeSystemMemoryAllocations:int 2024-08-06T20:51:30.2779602Z parm: NVreg_UsePageAttributeTable:int 2024-08-06T20:51:30.2779931Z parm: NVreg_EnablePCIeGen3:int 2024-08-06T20:51:30.2780223Z parm: NVreg_EnableMSI:int 2024-08-06T20:51:30.2780511Z parm: NVreg_TCEBypassMode:int 2024-08-06T20:51:30.2780824Z parm: NVreg_EnableStreamMemOPs:int 2024-08-06T20:51:30.2781175Z parm: NVreg_RestrictProfilingToAdminUsers:int 2024-08-06T20:51:30.2781735Z parm: NVreg_PreserveVideoMemoryAllocations:int 2024-08-06T20:51:30.2782125Z parm: NVreg_EnableS0ixPowerManagement:int 2024-08-06T20:51:30.2782529Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2024-08-06T20:51:30.2782936Z parm: NVreg_DynamicPowerManagement:int 2024-08-06T20:51:30.2783453Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2024-08-06T20:51:30.2783848Z parm: NVreg_EnableGpuFirmware:int 2024-08-06T20:51:30.2784179Z parm: NVreg_EnableGpuFirmwareLogs:int 2024-08-06T20:51:30.2784541Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2024-08-06T20:51:30.2784899Z parm: NVreg_EnableUserNUMAManagement:int 2024-08-06T20:51:30.2785235Z parm: NVreg_MemoryPoolSize:int 2024-08-06T20:51:30.2785552Z parm: NVreg_KMallocHeapMaxSize:int 2024-08-06T20:51:30.2785870Z parm: NVreg_VMallocHeapMaxSize:int 2024-08-06T20:51:30.2786189Z parm: NVreg_IgnoreMMIOCheck:int 2024-08-06T20:51:30.2786506Z parm: NVreg_NvLinkDisable:int 2024-08-06T20:51:30.2786840Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2024-08-06T20:51:30.2787206Z parm: NVreg_RegisterPCIDriver:int 2024-08-06T20:51:30.2787522Z parm: NVreg_EnableResizableBar:int 2024-08-06T20:51:30.2808558Z parm: NVreg_EnableDbgBreakpoint:int 2024-08-06T20:51:30.2808936Z parm: NVreg_EnableNonblockingOpen:int 2024-08-06T20:51:30.2809253Z parm: NVreg_RegistryDwords:charp 2024-08-06T20:51:30.2809575Z parm: NVreg_RegistryDwordsPerDevice:charp 2024-08-06T20:51:30.2809882Z parm: NVreg_RmMsg:charp 2024-08-06T20:51:30.2810146Z parm: NVreg_GpuBlacklist:charp 2024-08-06T20:51:30.2810445Z parm: NVreg_TemporaryFilePath:charp 2024-08-06T20:51:30.2810741Z parm: NVreg_ExcludedGpus:charp 2024-08-06T20:51:30.2811030Z parm: NVreg_DmaRemapPeerMmio:int 2024-08-06T20:51:30.2811330Z parm: NVreg_RmNvlinkBandwidth:charp 2024-08-06T20:51:30.2811636Z parm: NVreg_ImexChannelCount:int 2024-08-06T20:51:30.2811920Z parm: rm_firmware_active:charp 2024-08-06T20:51:30.2812171Z + set +e 2024-08-06T20:51:30.2812341Z + nvidia-smi 2024-08-06T20:51:30.2968138Z Tue Aug 6 20:51:30 2024 2024-08-06T20:51:30.2968488Z +-----------------------------------------------------------------------------------------+ 2024-08-06T20:51:30.2968968Z | NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 | 2024-08-06T20:51:30.2969422Z |-----------------------------------------+------------------------+----------------------+ 2024-08-06T20:51:30.2969886Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2024-08-06T20:51:30.2970380Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2024-08-06T20:51:30.2970785Z | | | MIG M. | 2024-08-06T20:51:30.2971101Z |=========================================+========================+======================| 2024-08-06T20:51:30.3178026Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2024-08-06T20:51:30.3178448Z | 0% 32C P8 16W / 300W | 0MiB / 23028MiB | 0% Default | 2024-08-06T20:51:30.3178819Z | | | N/A | 2024-08-06T20:51:30.3179185Z +-----------------------------------------+------------------------+----------------------+ 2024-08-06T20:51:30.3183041Z 2024-08-06T20:51:30.3183417Z +-----------------------------------------------------------------------------------------+ 2024-08-06T20:51:30.3183816Z | Processes: | 2024-08-06T20:51:30.3184480Z | GPU GI CI PID Type Process name GPU Memory | 2024-08-06T20:51:30.3184882Z | ID ID Usage | 2024-08-06T20:51:30.3185199Z |=========================================================================================| 2024-08-06T20:51:30.3190017Z | No running processes found | 2024-08-06T20:51:30.3190600Z +-----------------------------------------------------------------------------------------+ 2024-08-06T20:51:30.6233597Z + nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 2024-08-06T20:51:30.6425547Z NVIDIA A10G 2024-08-06T20:51:30.6490095Z + NVIDIA_SMI_STATUS=0 2024-08-06T20:51:30.6490385Z + '[' 0 -eq 0 ']' 2024-08-06T20:51:30.6490618Z + echo 'INFO: Ignoring allowed status 0' 2024-08-06T20:51:30.6490902Z + set -e 2024-08-06T20:51:30.6491107Z INFO: Ignoring allowed status 0 2024-08-06T20:51:30.6499450Z == Installing nvidia container toolkit for amzn2023 == 2024-08-06T20:51:30.6505933Z + sudo yum install -y yum-utils 2024-08-06T20:51:31.1045356Z Last metadata expiration check: 1:02:11 ago on Tue Aug 6 19:49:20 2024. 2024-08-06T20:51:31.1265492Z Package dnf-utils-4.3.0-13.amzn2023.0.4.noarch is already installed. 2024-08-06T20:51:31.1570419Z Dependencies resolved. 2024-08-06T20:51:31.1696039Z Nothing to do. 2024-08-06T20:51:31.1696266Z Complete! 2024-08-06T20:51:31.2680956Z + [[ amzn2023 == \a\m\z\n\2\0\2\3 ]] 2024-08-06T20:51:31.2681553Z + YUM_REPO_URL=https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2024-08-06T20:51:31.2682395Z + sudo yum-config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2024-08-06T20:51:31.5266859Z Adding repo from: https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2024-08-06T20:51:31.6039700Z + sudo yum install -y nvidia-docker2 2024-08-06T20:51:32.0625796Z nvidia-container-toolkit 11 kB/s | 833 B 00:00 2024-08-06T20:51:32.0844960Z Package nvidia-docker2-2.14.0-1.noarch is already installed. 2024-08-06T20:51:32.1146102Z Dependencies resolved. 2024-08-06T20:51:32.1269163Z Nothing to do. 2024-08-06T20:51:32.1269387Z Complete! 2024-08-06T20:51:32.2170097Z + sudo systemctl restart docker 2024-08-06T20:52:13.5867116Z nvidia-persistenced failed to initialize. Check syslog for more details. 2024-08-06T20:52:13.6100002Z Tue Aug 6 20:52:13 2024 2024-08-06T20:52:13.6100586Z +-----------------------------------------------------------------------------------------+ 2024-08-06T20:52:13.6101304Z | NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 | 2024-08-06T20:52:13.6101984Z |-----------------------------------------+------------------------+----------------------+ 2024-08-06T20:52:13.6102682Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2024-08-06T20:52:13.6103465Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2024-08-06T20:52:13.6104071Z | | | MIG M. | 2024-08-06T20:52:13.6104532Z |=========================================+========================+======================| 2024-08-06T20:52:13.6299911Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2024-08-06T20:52:13.6300523Z | 0% 32C P8 16W / 300W | 0MiB / 23028MiB | 0% Default | 2024-08-06T20:52:13.6301016Z | | | N/A | 2024-08-06T20:52:13.6301544Z +-----------------------------------------+------------------------+----------------------+ 2024-08-06T20:52:13.6304929Z 2024-08-06T20:52:13.6305470Z +-----------------------------------------------------------------------------------------+ 2024-08-06T20:52:13.6306493Z | Processes: | 2024-08-06T20:52:13.6307129Z | GPU GI CI PID Type Process name GPU Memory | 2024-08-06T20:52:13.6307716Z | ID ID Usage | 2024-08-06T20:52:13.6308354Z |=========================================================================================| 2024-08-06T20:52:13.6311886Z | No running processes found | 2024-08-06T20:52:13.6312525Z +-----------------------------------------------------------------------------------------+ 2024-08-06T20:52:14.6868360Z Command completed after 1 attempt(s). 2024-08-06T20:52:14.6943605Z ##[group]Run python3 -m pip install psutil==5.9.1 nvidia-ml-py==11.525.84 2024-08-06T20:52:14.6944143Z python3 -m pip install psutil==5.9.1 nvidia-ml-py==11.525.84 2024-08-06T20:52:14.6944585Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 & 2024-08-06T20:52:14.6945005Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2024-08-06T20:52:14.6957668Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:14.6958001Z env: 2024-08-06T20:52:14.6958195Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:14.6958510Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:14.6958816Z ##[endgroup] 2024-08-06T20:52:14.9444462Z Defaulting to user installation because normal site-packages is not writeable 2024-08-06T20:52:14.9631123Z Requirement already satisfied: psutil==5.9.1 in /home/ec2-user/.local/lib/python3.9/site-packages (5.9.1) 2024-08-06T20:52:14.9636350Z Requirement already satisfied: nvidia-ml-py==11.525.84 in /home/ec2-user/.local/lib/python3.9/site-packages (11.525.84) 2024-08-06T20:52:15.1009823Z Prepare all required actions 2024-08-06T20:52:15.1010322Z Getting action download info 2024-08-06T20:52:15.2066194Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2024-08-06T20:52:15.4394160Z Download action repository 'actions/download-artifact@v3' (SHA:9bc31d5ccc31df68ecc42ccf4149144866c47d8a) 2024-08-06T20:52:15.5950129Z ##[group]Run ./.github/actions/download-build-artifacts 2024-08-06T20:52:15.5950471Z with: 2024-08-06T20:52:15.5950700Z name: linux-focal-cuda12.1-py3.10-gcc9-sm86 2024-08-06T20:52:15.5950996Z s3-bucket: gha-artifacts 2024-08-06T20:52:15.5951229Z env: 2024-08-06T20:52:15.5951420Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:15.5951711Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:15.5952028Z ##[endgroup] 2024-08-06T20:52:15.5985877Z ##[group]Run seemethere/download-artifact-s3@v4 2024-08-06T20:52:15.5986187Z with: 2024-08-06T20:52:15.5986443Z name: linux-focal-cuda12.1-py3.10-gcc9-sm86 2024-08-06T20:52:15.5986750Z s3-bucket: gha-artifacts 2024-08-06T20:52:15.5986990Z region: us-east-1 2024-08-06T20:52:15.5987195Z env: 2024-08-06T20:52:15.5987385Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:15.5987687Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:15.5988005Z ##[endgroup] 2024-08-06T20:52:16.0858917Z (node:557235) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2024-08-06T20:52:16.0859404Z 2024-08-06T20:52:16.0859581Z Please migrate your code to use AWS SDK for JavaScript (v3). 2024-08-06T20:52:16.0860075Z For more information, check the migration guide at https://a.co/7PzMCcy 2024-08-06T20:52:16.0860587Z (Use `node --trace-warnings ...` to show where the warning was created) 2024-08-06T20:52:16.1663273Z Found 1 objects with prefix pytorch/pytorch/10273124344/linux-focal-cuda12.1-py3.10-gcc9-sm86/ 2024-08-06T20:52:16.1664005Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2024-08-06T20:52:38.3110376Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2024-08-06T20:52:38.3116880Z Artifact download has finished successfully 2024-08-06T20:52:38.3474502Z ##[group]Run unzip -o artifacts.zip 2024-08-06T20:52:38.3474815Z unzip -o artifacts.zip 2024-08-06T20:52:38.3483814Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:38.3484167Z env: 2024-08-06T20:52:38.3484597Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:38.3484897Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:38.3485219Z ##[endgroup] 2024-08-06T20:52:38.3529686Z Archive: artifacts.zip 2024-08-06T20:52:38.3531290Z creating: dist/ 2024-08-06T20:52:40.4856672Z inflating: dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl 2024-08-06T20:52:40.4858688Z creating: build/custom_test_artifacts/ 2024-08-06T20:52:40.4859288Z creating: build/custom_test_artifacts/custom-op-build/ 2024-08-06T20:52:40.4859785Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2024-08-06T20:52:40.4860310Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2024-08-06T20:52:40.4867000Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2024-08-06T20:52:40.4867728Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/ 2024-08-06T20:52:40.4868509Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-08-06T20:52:40.4869266Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-08-06T20:52:40.4869886Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-08-06T20:52:40.4871865Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-08-06T20:52:40.4874297Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-08-06T20:52:40.4875040Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-08-06T20:52:40.4875680Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-08-06T20:52:40.4877507Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-08-06T20:52:40.4879199Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-08-06T20:52:40.4881265Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-08-06T20:52:40.4882690Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-08-06T20:52:40.4884825Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-08-06T20:52:40.4886103Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-08-06T20:52:40.4886927Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/ 2024-08-06T20:52:40.4887633Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/ 2024-08-06T20:52:40.4929227Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2024-08-06T20:52:40.4968069Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2024-08-06T20:52:40.4969176Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2024-08-06T20:52:40.5015575Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2024-08-06T20:52:40.5016694Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2024-08-06T20:52:40.5018204Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2024-08-06T20:52:40.5019492Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2024-08-06T20:52:40.5020633Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2024-08-06T20:52:40.5021957Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2024-08-06T20:52:40.5023077Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2024-08-06T20:52:40.5024144Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2024-08-06T20:52:40.5025205Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2024-08-06T20:52:40.5026227Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2024-08-06T20:52:40.5040689Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.reg.c 2024-08-06T20:52:40.5041528Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin 2024-08-06T20:52:40.5042339Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2024-08-06T20:52:40.5043135Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.o 2024-08-06T20:52:40.5043931Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/CMakeCUDACompilerId.cu 2024-08-06T20:52:40.5091474Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/a.out 2024-08-06T20:52:40.5151999Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CUDA.bin 2024-08-06T20:52:40.5152957Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCUDACompiler.cmake 2024-08-06T20:52:40.5153735Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2024-08-06T20:52:40.5154439Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2024-08-06T20:52:40.5155156Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2024-08-06T20:52:40.5155956Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2024-08-06T20:52:40.5156771Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2024-08-06T20:52:40.5157706Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2024-08-06T20:52:40.5158609Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2024-08-06T20:52:40.5159401Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2024-08-06T20:52:40.5160275Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2024-08-06T20:52:40.5161087Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2024-08-06T20:52:40.5161941Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2024-08-06T20:52:40.5162797Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2024-08-06T20:52:40.5163540Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2024-08-06T20:52:40.5184430Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2024-08-06T20:52:40.5329118Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2024-08-06T20:52:40.5330968Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2024-08-06T20:52:40.5332855Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2024-08-06T20:52:40.5335123Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2024-08-06T20:52:40.5336462Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2024-08-06T20:52:40.5337241Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2024-08-06T20:52:40.5337967Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2024-08-06T20:52:40.5338707Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2024-08-06T20:52:40.5339440Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2024-08-06T20:52:40.5340161Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2024-08-06T20:52:40.5340880Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2024-08-06T20:52:40.5357896Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2024-08-06T20:52:40.5442741Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2024-08-06T20:52:40.5444759Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-08-06T20:52:40.5446365Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2024-08-06T20:52:40.5447317Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2024-08-06T20:52:40.5447974Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2024-08-06T20:52:40.5448562Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2024-08-06T20:52:40.5449144Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2024-08-06T20:52:40.5450947Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2024-08-06T20:52:40.5451886Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2024-08-06T20:52:40.5452807Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2024-08-06T20:52:40.5572826Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2024-08-06T20:52:40.5636351Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2024-08-06T20:52:40.5636984Z creating: build/custom_test_artifacts/jit-hook-build/ 2024-08-06T20:52:40.5637559Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2024-08-06T20:52:40.5638149Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2024-08-06T20:52:40.5645441Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2024-08-06T20:52:40.5646161Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/ 2024-08-06T20:52:40.5647180Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-08-06T20:52:40.5647926Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-08-06T20:52:40.5648591Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-08-06T20:52:40.5651031Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-08-06T20:52:40.5652658Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-08-06T20:52:40.5653432Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-08-06T20:52:40.5654128Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-08-06T20:52:40.5656729Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-08-06T20:52:40.5658228Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-08-06T20:52:40.5660443Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-08-06T20:52:40.5661402Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-08-06T20:52:40.5663318Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-08-06T20:52:40.5664765Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-08-06T20:52:40.5665589Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/ 2024-08-06T20:52:40.5666349Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/ 2024-08-06T20:52:40.5710260Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2024-08-06T20:52:40.5750021Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2024-08-06T20:52:40.5751068Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2024-08-06T20:52:40.5797408Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2024-08-06T20:52:40.5798497Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2024-08-06T20:52:40.5799875Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2024-08-06T20:52:40.5801052Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2024-08-06T20:52:40.5802163Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2024-08-06T20:52:40.5803272Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2024-08-06T20:52:40.5804364Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2024-08-06T20:52:40.5805438Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2024-08-06T20:52:40.5806534Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2024-08-06T20:52:40.5807568Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2024-08-06T20:52:40.5808546Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.reg.c 2024-08-06T20:52:40.5809529Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin 2024-08-06T20:52:40.5810455Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2024-08-06T20:52:40.5811258Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.o 2024-08-06T20:52:40.5812235Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/CMakeCUDACompilerId.cu 2024-08-06T20:52:40.5873045Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/a.out 2024-08-06T20:52:40.5933520Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CUDA.bin 2024-08-06T20:52:40.5934567Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCUDACompiler.cmake 2024-08-06T20:52:40.5935315Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2024-08-06T20:52:40.5936049Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2024-08-06T20:52:40.5936685Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2024-08-06T20:52:40.5937711Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2024-08-06T20:52:40.5938536Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2024-08-06T20:52:40.5939522Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2024-08-06T20:52:40.5940458Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2024-08-06T20:52:40.5941246Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2024-08-06T20:52:40.5942181Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2024-08-06T20:52:40.5943076Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2024-08-06T20:52:40.5943986Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2024-08-06T20:52:40.5944806Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2024-08-06T20:52:40.5945516Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2024-08-06T20:52:40.5965880Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2024-08-06T20:52:40.6030678Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2024-08-06T20:52:40.6031638Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-08-06T20:52:40.6032732Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2024-08-06T20:52:40.6033546Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2024-08-06T20:52:40.6034473Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2024-08-06T20:52:40.6035908Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2024-08-06T20:52:40.6036645Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2024-08-06T20:52:40.6039851Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2024-08-06T20:52:40.6040448Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2024-08-06T20:52:40.6041533Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2024-08-06T20:52:40.6091516Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2024-08-06T20:52:40.6092229Z creating: build/custom_test_artifacts/custom-backend-build/ 2024-08-06T20:52:40.6092785Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2024-08-06T20:52:40.6093405Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2024-08-06T20:52:40.6100838Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2024-08-06T20:52:40.6101621Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/ 2024-08-06T20:52:40.6102382Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-08-06T20:52:40.6103145Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-08-06T20:52:40.6103817Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-08-06T20:52:40.6105473Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-08-06T20:52:40.6107626Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-08-06T20:52:40.6108447Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-08-06T20:52:40.6109333Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-08-06T20:52:40.6110966Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-08-06T20:52:40.6113165Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-08-06T20:52:40.6114751Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-08-06T20:52:40.6115743Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-08-06T20:52:40.6117717Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-08-06T20:52:40.6119044Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-08-06T20:52:40.6119921Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/ 2024-08-06T20:52:40.6120687Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/ 2024-08-06T20:52:40.6160967Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2024-08-06T20:52:40.6201204Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2024-08-06T20:52:40.6202450Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2024-08-06T20:52:40.6248074Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2024-08-06T20:52:40.6249188Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2024-08-06T20:52:40.6250451Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2024-08-06T20:52:40.6251667Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2024-08-06T20:52:40.6252890Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2024-08-06T20:52:40.6254030Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2024-08-06T20:52:40.6255226Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2024-08-06T20:52:40.6256365Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2024-08-06T20:52:40.6257430Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2024-08-06T20:52:40.6258493Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2024-08-06T20:52:40.6259582Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.reg.c 2024-08-06T20:52:40.6260533Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin 2024-08-06T20:52:40.6261568Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2024-08-06T20:52:40.6262472Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.o 2024-08-06T20:52:40.6263521Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/CMakeCUDACompilerId.cu 2024-08-06T20:52:40.6324949Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/a.out 2024-08-06T20:52:40.6385039Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CUDA.bin 2024-08-06T20:52:40.6386408Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCUDACompiler.cmake 2024-08-06T20:52:40.6387284Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2024-08-06T20:52:40.6388029Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2024-08-06T20:52:40.6388812Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2024-08-06T20:52:40.6389593Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2024-08-06T20:52:40.6390562Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2024-08-06T20:52:40.6391586Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2024-08-06T20:52:40.6392721Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2024-08-06T20:52:40.6393656Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2024-08-06T20:52:40.6394592Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2024-08-06T20:52:40.6395717Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2024-08-06T20:52:40.6396800Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2024-08-06T20:52:40.6397585Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2024-08-06T20:52:40.6398357Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2024-08-06T20:52:40.6401495Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2024-08-06T20:52:40.6527780Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2024-08-06T20:52:40.6528674Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2024-08-06T20:52:40.6529665Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2024-08-06T20:52:40.6530714Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2024-08-06T20:52:40.6531757Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2024-08-06T20:52:40.6532625Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2024-08-06T20:52:40.6533667Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2024-08-06T20:52:40.6534817Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2024-08-06T20:52:40.6535802Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2024-08-06T20:52:40.6536699Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2024-08-06T20:52:40.6537508Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2024-08-06T20:52:40.6557580Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2024-08-06T20:52:40.6613628Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2024-08-06T20:52:40.6614765Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-08-06T20:52:40.6616137Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2024-08-06T20:52:40.6616827Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2024-08-06T20:52:40.6617689Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2024-08-06T20:52:40.6619325Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2024-08-06T20:52:40.6619945Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2024-08-06T20:52:40.6622756Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2024-08-06T20:52:40.6623782Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2024-08-06T20:52:40.6624596Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2024-08-06T20:52:40.6730098Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2024-08-06T20:52:40.6772777Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2024-08-06T20:52:40.6773205Z creating: build/lib/ 2024-08-06T20:52:40.6862567Z inflating: build/lib/libprotobuf-lite.a 2024-08-06T20:52:40.7321673Z inflating: build/lib/libprotobuf.a 2024-08-06T20:52:40.7331420Z inflating: build/lib/libpthreadpool.a 2024-08-06T20:52:40.7339957Z inflating: build/lib/libcpuinfo.a 2024-08-06T20:52:40.7348286Z inflating: build/lib/libcpuinfo_internals.a 2024-08-06T20:52:40.7348972Z inflating: build/lib/libclog.a 2024-08-06T20:52:40.7367483Z inflating: build/lib/libnnpack.a 2024-08-06T20:52:40.7370285Z inflating: build/lib/libnnpack_reference_layers.a 2024-08-06T20:52:40.7435980Z inflating: build/lib/libgtest.a 2024-08-06T20:52:40.7511525Z inflating: build/lib/libbenchmark.a 2024-08-06T20:52:40.7576315Z inflating: build/lib/libasmjit.a 2024-08-06T20:52:40.7583503Z inflating: build/lib/libittnotify.a 2024-08-06T20:52:40.7611827Z inflating: build/lib/libtensorpipe_uv.a 2024-08-06T20:52:40.7739294Z inflating: build/lib/libgloo.a 2024-08-06T20:52:40.7760465Z inflating: build/lib/libfmt.a 2024-08-06T20:52:40.7855254Z inflating: build/lib/libc10.so 2024-08-06T20:52:40.7857167Z inflating: build/lib/libcaffe2_nvrtc.so 2024-08-06T20:52:40.7858017Z inflating: build/lib/libfoxi_loader.a 2024-08-06T20:52:40.7859914Z inflating: build/lib/libtorch_global_deps.so 2024-08-06T20:52:40.7879443Z inflating: build/lib/libpytorch_qnnpack.a 2024-08-06T20:52:40.7898221Z inflating: build/lib/libgmock.a 2024-08-06T20:52:40.7898794Z inflating: build/lib/libgtest_main.a 2024-08-06T20:52:40.7899838Z inflating: build/lib/libbenchmark_main.a 2024-08-06T20:52:40.8406168Z inflating: build/lib/libprotoc.a 2024-08-06T20:52:40.8796240Z inflating: build/lib/libgloo_cuda.a 2024-08-06T20:52:41.9000253Z inflating: build/lib/libdnnl.a 2024-08-06T20:52:41.9576404Z inflating: build/lib/libtensorpipe.a 2024-08-06T20:52:41.9634020Z inflating: build/lib/libc10_cuda.so 2024-08-06T20:52:41.9634642Z inflating: build/lib/libgmock_main.a 2024-08-06T20:52:42.0895442Z inflating: build/lib/libfbgemm.a 2024-08-06T20:52:42.1152604Z inflating: build/lib/libtensorpipe_cuda.a 2024-08-06T20:52:42.1658955Z inflating: build/lib/libkineto.a 2024-08-06T20:52:42.1843386Z inflating: build/lib/libXNNPACK.a 2024-08-06T20:52:42.1886691Z inflating: build/lib/libonnx_proto.a 2024-08-06T20:52:42.2581005Z inflating: build/lib/libonnx.a 2024-08-06T20:52:44.6937108Z inflating: build/lib/libtorch_cpu.so 2024-08-06T20:52:44.6941669Z inflating: build/lib/libunbox_lib.a 2024-08-06T20:52:44.6945945Z inflating: build/lib/libshm.so 2024-08-06T20:52:46.7693088Z inflating: build/lib/libtorch_cuda.so 2024-08-06T20:52:46.7694790Z inflating: build/lib/libtorch.so 2024-08-06T20:52:47.5926333Z inflating: build/lib/libtorch_cuda_linalg.so 2024-08-06T20:52:47.5928945Z inflating: build/lib/libc10d_cuda_test.so 2024-08-06T20:52:47.7923492Z inflating: build/lib/libtorch_python.so 2024-08-06T20:52:47.7996749Z inflating: build/lib/libtorchbind_test.so 2024-08-06T20:52:47.8017267Z inflating: build/lib/libjitbackend_test.so 2024-08-06T20:52:47.8043101Z inflating: build/lib/libbackend_with_compiler.so 2024-08-06T20:52:47.8068814Z inflating: build/lib/libaoti_custom_ops.so 2024-08-06T20:52:47.8103904Z inflating: build/lib/libnnapi_backend.so 2024-08-06T20:52:47.8104204Z creating: build/bin/ 2024-08-06T20:52:47.8155438Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2024-08-06T20:52:47.8206932Z inflating: build/bin/c10_DeviceGuard_test 2024-08-06T20:52:47.8257562Z inflating: build/bin/c10_Device_test 2024-08-06T20:52:47.8317177Z inflating: build/bin/c10_DispatchKeySet_test 2024-08-06T20:52:47.8370818Z inflating: build/bin/c10_Scalar_test 2024-08-06T20:52:47.8421951Z inflating: build/bin/c10_StreamGuard_test 2024-08-06T20:52:47.8473654Z inflating: build/bin/c10_SymInt_test 2024-08-06T20:52:47.8528908Z inflating: build/bin/c10_InlineDeviceGuard_test 2024-08-06T20:52:47.8585124Z inflating: build/bin/c10_InlineStreamGuard_test 2024-08-06T20:52:47.8642107Z inflating: build/bin/c10_SizesAndStrides_test 2024-08-06T20:52:47.8713651Z inflating: build/bin/c10_cow_test 2024-08-06T20:52:47.8767678Z inflating: build/bin/c10_Bitset_test 2024-08-06T20:52:47.8817003Z inflating: build/bin/c10_ConstexprCrc_test 2024-08-06T20:52:47.8867532Z inflating: build/bin/c10_DeadlockDetection_test 2024-08-06T20:52:47.8918740Z inflating: build/bin/c10_Half_test 2024-08-06T20:52:47.8975644Z inflating: build/bin/c10_LeftRight_test 2024-08-06T20:52:47.9031359Z inflating: build/bin/c10_Metaprogramming_test 2024-08-06T20:52:47.9081883Z inflating: build/bin/c10_Synchronized_test 2024-08-06T20:52:47.9138017Z inflating: build/bin/c10_ThreadLocal_test 2024-08-06T20:52:47.9189820Z inflating: build/bin/c10_TypeIndex_test 2024-08-06T20:52:47.9241730Z inflating: build/bin/c10_TypeList_test 2024-08-06T20:52:47.9290933Z inflating: build/bin/c10_TypeTraits_test 2024-08-06T20:52:47.9343195Z inflating: build/bin/c10_accumulate_test 2024-08-06T20:52:47.9399120Z inflating: build/bin/c10_bfloat16_test 2024-08-06T20:52:47.9450699Z inflating: build/bin/c10_bit_cast_test 2024-08-06T20:52:47.9507273Z inflating: build/bin/c10_complex_math_test 2024-08-06T20:52:47.9563149Z inflating: build/bin/c10_complex_test 2024-08-06T20:52:47.9618086Z inflating: build/bin/c10_exception_test 2024-08-06T20:52:47.9668434Z inflating: build/bin/c10_flags_test 2024-08-06T20:52:47.9719038Z inflating: build/bin/c10_generic_math_test 2024-08-06T20:52:47.9885033Z inflating: build/bin/c10_intrusive_ptr_test 2024-08-06T20:52:47.9936328Z inflating: build/bin/c10_irange_test 2024-08-06T20:52:47.9989657Z inflating: build/bin/c10_lazy_test 2024-08-06T20:52:48.0047655Z inflating: build/bin/c10_logging_test 2024-08-06T20:52:48.0123529Z inflating: build/bin/c10_optional_test 2024-08-06T20:52:48.0185817Z inflating: build/bin/c10_ordered_preserving_dict_test 2024-08-06T20:52:48.0240509Z inflating: build/bin/c10_registry_test 2024-08-06T20:52:48.0396948Z inflating: build/bin/c10_small_vector_test 2024-08-06T20:52:48.0451917Z inflating: build/bin/c10_ssize_test 2024-08-06T20:52:48.0506033Z inflating: build/bin/c10_string_util_test 2024-08-06T20:52:48.0565195Z inflating: build/bin/c10_string_view_test 2024-08-06T20:52:48.0618006Z inflating: build/bin/c10_tempfile_test 2024-08-06T20:52:48.0666780Z inflating: build/bin/c10_intrusive_ptr_benchmark 2024-08-06T20:52:48.0723660Z inflating: build/bin/c10_typeid_test 2024-08-06T20:52:48.1174975Z inflating: build/bin/protoc-3.13.0.0 2024-08-06T20:52:48.1624142Z inflating: build/bin/protoc 2024-08-06T20:52:48.1677307Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 2024-08-06T20:52:48.1730900Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2024-08-06T20:52:48.1783334Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2024-08-06T20:52:48.1836987Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2024-08-06T20:52:48.1889733Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2024-08-06T20:52:48.1944457Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2024-08-06T20:52:48.1994906Z inflating: build/bin/c10_cuda_CUDATest 2024-08-06T20:52:48.2047843Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2024-08-06T20:52:48.2379526Z inflating: build/bin/vec_test_all_types_DEFAULT 2024-08-06T20:52:48.2725428Z inflating: build/bin/vec_test_all_types_AVX512 2024-08-06T20:52:48.3087940Z inflating: build/bin/vec_test_all_types_AVX2 2024-08-06T20:52:48.3140762Z inflating: build/bin/BackoffTest 2024-08-06T20:52:48.3194424Z inflating: build/bin/FileStoreTest 2024-08-06T20:52:48.3250406Z inflating: build/bin/TCPStoreTest 2024-08-06T20:52:48.3304535Z inflating: build/bin/HashStoreTest 2024-08-06T20:52:48.3318236Z inflating: build/bin/ProcessGroupMPITest 2024-08-06T20:52:48.3373281Z inflating: build/bin/test_edge_op_registration 2024-08-06T20:52:48.3377858Z inflating: build/bin/torch_shm_manager 2024-08-06T20:52:48.3381270Z inflating: build/bin/example_allreduce 2024-08-06T20:52:48.3436007Z inflating: build/bin/test_dist_autograd 2024-08-06T20:52:48.3505405Z inflating: build/bin/test_cpp_rpc 2024-08-06T20:52:48.3508148Z inflating: build/bin/parallel_benchmark 2024-08-06T20:52:48.3575498Z inflating: build/bin/test_mobile_nnc 2024-08-06T20:52:48.3584675Z inflating: build/bin/aot_model_compiler_test 2024-08-06T20:52:48.3933448Z inflating: build/bin/test_lazy 2024-08-06T20:52:48.5121480Z inflating: build/bin/test_api 2024-08-06T20:52:48.5195023Z inflating: build/bin/Dict_test 2024-08-06T20:52:48.5247474Z inflating: build/bin/Dimname_test 2024-08-06T20:52:48.5312596Z inflating: build/bin/MaybeOwned_test 2024-08-06T20:52:48.5370180Z inflating: build/bin/NamedTensor_test 2024-08-06T20:52:48.5430204Z inflating: build/bin/apply_utils_test 2024-08-06T20:52:48.5489363Z inflating: build/bin/atest 2024-08-06T20:52:48.5553149Z inflating: build/bin/basic 2024-08-06T20:52:48.5608692Z inflating: build/bin/broadcast_test 2024-08-06T20:52:48.5659760Z inflating: build/bin/cpu_allocator_test 2024-08-06T20:52:48.5718931Z inflating: build/bin/cpu_generator_test 2024-08-06T20:52:48.5773296Z inflating: build/bin/cpu_profiling_allocator_test 2024-08-06T20:52:48.5867373Z inflating: build/bin/cpu_rng_test 2024-08-06T20:52:48.5917625Z inflating: build/bin/dispatch_key_set_test 2024-08-06T20:52:48.5969979Z inflating: build/bin/dlconvertor_test 2024-08-06T20:52:48.6028252Z inflating: build/bin/extension_backend_test 2024-08-06T20:52:48.6083717Z inflating: build/bin/half_test 2024-08-06T20:52:48.6180117Z inflating: build/bin/ivalue_test 2024-08-06T20:52:48.6230795Z inflating: build/bin/lazy_tensor_test 2024-08-06T20:52:48.6285954Z inflating: build/bin/math_kernel_test 2024-08-06T20:52:48.6340211Z inflating: build/bin/memory_format_test 2024-08-06T20:52:48.6393795Z inflating: build/bin/memory_overlapping_test 2024-08-06T20:52:48.6447672Z inflating: build/bin/mobile_memory_cleanup 2024-08-06T20:52:48.6504083Z inflating: build/bin/native_test 2024-08-06T20:52:48.6554864Z inflating: build/bin/operator_name_test 2024-08-06T20:52:48.6606942Z inflating: build/bin/operators_test 2024-08-06T20:52:48.6659260Z inflating: build/bin/packedtensoraccessor_test 2024-08-06T20:52:48.6727183Z inflating: build/bin/pow_test 2024-08-06T20:52:48.6784531Z inflating: build/bin/quantized_test 2024-08-06T20:52:48.6835520Z inflating: build/bin/reduce_ops_test 2024-08-06T20:52:48.6886493Z inflating: build/bin/reportMemoryUsage_test 2024-08-06T20:52:48.6944645Z inflating: build/bin/scalar_tensor_test 2024-08-06T20:52:48.7003814Z inflating: build/bin/scalar_test 2024-08-06T20:52:48.7055798Z inflating: build/bin/StorageUtils_test 2024-08-06T20:52:48.7110440Z inflating: build/bin/stride_properties_test 2024-08-06T20:52:48.7189709Z inflating: build/bin/tensor_iterator_test 2024-08-06T20:52:48.7245064Z inflating: build/bin/test_parallel 2024-08-06T20:52:48.7248073Z inflating: build/bin/thread_init_test 2024-08-06T20:52:48.7304130Z inflating: build/bin/type_ptr_test 2024-08-06T20:52:48.7364711Z inflating: build/bin/type_test 2024-08-06T20:52:48.7417385Z inflating: build/bin/undefined_tensor_test 2024-08-06T20:52:48.7419033Z inflating: build/bin/verify_api_visibility 2024-08-06T20:52:48.7489304Z inflating: build/bin/legacy_vmap_test 2024-08-06T20:52:48.7541669Z inflating: build/bin/weakref_test 2024-08-06T20:52:48.7593375Z inflating: build/bin/wrapdim_test 2024-08-06T20:52:48.7654940Z inflating: build/bin/IListRef_test 2024-08-06T20:52:48.7706713Z inflating: build/bin/xla_tensor_test 2024-08-06T20:52:48.7813724Z inflating: build/bin/List_test 2024-08-06T20:52:48.7881203Z inflating: build/bin/KernelFunction_test 2024-08-06T20:52:48.8004782Z inflating: build/bin/kernel_function_legacy_test 2024-08-06T20:52:48.8103686Z inflating: build/bin/kernel_function_test 2024-08-06T20:52:48.8233886Z inflating: build/bin/kernel_lambda_legacy_test 2024-08-06T20:52:48.8338047Z inflating: build/bin/kernel_lambda_test 2024-08-06T20:52:48.8401402Z inflating: build/bin/kernel_stackbased_test 2024-08-06T20:52:48.8500115Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2024-08-06T20:52:48.8551444Z inflating: build/bin/CppSignature_test 2024-08-06T20:52:48.8608053Z inflating: build/bin/backend_fallback_test 2024-08-06T20:52:48.8657585Z inflating: build/bin/op_allowlist_test 2024-08-06T20:52:48.8722359Z inflating: build/bin/inline_container_test 2024-08-06T20:52:48.9027855Z inflating: build/bin/op_registration_test 2024-08-06T20:52:48.9081013Z inflating: build/bin/cuda_apply_test 2024-08-06T20:52:48.9134153Z inflating: build/bin/cuda_allocator_test 2024-08-06T20:52:48.9193291Z inflating: build/bin/cuda_atomic_ops_test 2024-08-06T20:52:48.9248138Z inflating: build/bin/cuda_caching_host_allocator_test 2024-08-06T20:52:48.9319029Z inflating: build/bin/cuda_complex_math_test 2024-08-06T20:52:48.9378668Z inflating: build/bin/cuda_complex_test 2024-08-06T20:52:48.9430435Z inflating: build/bin/cuda_device_test 2024-08-06T20:52:48.9488956Z inflating: build/bin/cuda_cub_test 2024-08-06T20:52:48.9540445Z inflating: build/bin/cuda_dlconvertor_test 2024-08-06T20:52:48.9605584Z inflating: build/bin/cuda_distributions_test 2024-08-06T20:52:48.9662562Z inflating: build/bin/cuda_generator_test 2024-08-06T20:52:48.9713071Z inflating: build/bin/cuda_half_test 2024-08-06T20:52:48.9765550Z inflating: build/bin/cuda_integer_divider_test 2024-08-06T20:52:48.9815500Z inflating: build/bin/cuda_optional_test 2024-08-06T20:52:48.9867246Z inflating: build/bin/cuda_packedtensoraccessor_test 2024-08-06T20:52:48.9920651Z inflating: build/bin/cuda_reportMemoryUsage_test 2024-08-06T20:52:48.9971021Z inflating: build/bin/cuda_allocatorTraceTracker_test 2024-08-06T20:52:49.0032535Z inflating: build/bin/cuda_stream_test 2024-08-06T20:52:49.0082382Z inflating: build/bin/cuda_cudnn_test 2024-08-06T20:52:49.0135038Z inflating: build/bin/cuda_vectorized_test 2024-08-06T20:52:49.0149646Z inflating: build/bin/tutorial_tensorexpr 2024-08-06T20:52:49.0216535Z inflating: build/bin/ProcessGroupGlooTest 2024-08-06T20:52:49.0274655Z inflating: build/bin/ProcessGroupGlooAsyncTest 2024-08-06T20:52:49.0338752Z inflating: build/bin/ProcessGroupNCCLTest 2024-08-06T20:52:49.0401308Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2024-08-06T20:52:49.1236077Z inflating: build/bin/test_tensorexpr 2024-08-06T20:52:49.1825053Z inflating: build/bin/test_jit 2024-08-06T20:52:49.1825347Z creating: .additional_ci_files/ 2024-08-06T20:52:49.1884219Z inflating: .additional_ci_files/test-times.json 2024-08-06T20:52:49.2117517Z inflating: .additional_ci_files/test-class-times.json 2024-08-06T20:52:49.2153398Z ##[group]Run rm artifacts.zip 2024-08-06T20:52:49.2153691Z rm artifacts.zip 2024-08-06T20:52:49.2163421Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:49.2163761Z env: 2024-08-06T20:52:49.2163958Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:49.2164253Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:49.2164575Z ##[endgroup] 2024-08-06T20:52:49.3270093Z ##[group]Run df -H 2024-08-06T20:52:49.3270323Z df -H 2024-08-06T20:52:49.3278384Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:49.3278740Z env: 2024-08-06T20:52:49.3278930Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:49.3279235Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:49.3279560Z ##[endgroup] 2024-08-06T20:52:49.3327577Z Filesystem Size Used Avail Use% Mounted on 2024-08-06T20:52:49.3328018Z devtmpfs 4.2M 0 4.2M 0% /dev 2024-08-06T20:52:49.3328357Z tmpfs 34G 25k 34G 1% /dev/shm 2024-08-06T20:52:49.3328659Z tmpfs 14G 553k 14G 1% /run 2024-08-06T20:52:49.3329190Z /dev/nvme0n1p1 161G 38G 124G 24% / 2024-08-06T20:52:49.3329481Z tmpfs 34G 33k 34G 1% /tmp 2024-08-06T20:52:49.3329782Z /dev/nvme0n1p128 11M 1.4M 9.2M 13% /boot/efi 2024-08-06T20:52:49.3330098Z tmpfs 6.7G 0 6.7G 0% /run/user/0 2024-08-06T20:52:49.3364615Z Prepare all required actions 2024-08-06T20:52:49.3364946Z Getting action download info 2024-08-06T20:52:49.4606127Z ##[group]Run ./.github/actions/download-td-artifacts 2024-08-06T20:52:49.4606461Z with: 2024-08-06T20:52:49.4606647Z env: 2024-08-06T20:52:49.4606844Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:49.4607153Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:49.4607512Z ##[endgroup] 2024-08-06T20:52:49.4658659Z ##[group]Run seemethere/download-artifact-s3@v4 2024-08-06T20:52:49.4658973Z with: 2024-08-06T20:52:49.4659159Z name: td_results 2024-08-06T20:52:49.4659390Z s3-bucket: gha-artifacts 2024-08-06T20:52:49.4659634Z region: us-east-1 2024-08-06T20:52:49.4659836Z env: 2024-08-06T20:52:49.4660034Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:49.4660341Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:49.4660659Z ##[endgroup] 2024-08-06T20:52:49.9439502Z (node:557260) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2024-08-06T20:52:49.9440098Z 2024-08-06T20:52:49.9440291Z Please migrate your code to use AWS SDK for JavaScript (v3). 2024-08-06T20:52:49.9440770Z For more information, check the migration guide at https://a.co/7PzMCcy 2024-08-06T20:52:49.9441288Z (Use `node --trace-warnings ...` to show where the warning was created) 2024-08-06T20:52:50.0227135Z Found 1 objects with prefix pytorch/pytorch/10273124344/td_results/ 2024-08-06T20:52:50.0227701Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2024-08-06T20:52:50.1197851Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2024-08-06T20:52:50.1203910Z Artifact download has finished successfully 2024-08-06T20:52:50.1533204Z ##[group]Run mkdir -p .additional_ci_files 2024-08-06T20:52:50.1533540Z mkdir -p .additional_ci_files 2024-08-06T20:52:50.1533916Z mv td_results.json .additional_ci_files/td_results.json 2024-08-06T20:52:50.1544075Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:50.1544414Z env: 2024-08-06T20:52:50.1544606Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:50.1545101Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:50.1545425Z ##[endgroup] 2024-08-06T20:52:50.1638401Z ##[group]Run .github/scripts/parse_ref.py 2024-08-06T20:52:50.1638760Z .github/scripts/parse_ref.py 2024-08-06T20:52:50.1647273Z shell: /usr/bin/bash -e {0} 2024-08-06T20:52:50.1647522Z env: 2024-08-06T20:52:50.1647716Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:50.1648012Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:50.1648340Z ##[endgroup] 2024-08-06T20:52:50.1920006Z Prepare all required actions 2024-08-06T20:52:50.1958201Z ##[group]Run ./.github/actions/get-workflow-job-id 2024-08-06T20:52:50.1958501Z with: 2024-08-06T20:52:50.1958859Z github-token: *** 2024-08-06T20:52:50.1959077Z env: 2024-08-06T20:52:50.1959273Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:50.1959566Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:50.1959890Z ##[endgroup] 2024-08-06T20:52:50.1981632Z ##[group]Run set -eux 2024-08-06T20:52:50.1981890Z set -eux 2024-08-06T20:52:50.1982276Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2024-08-06T20:52:50.1990830Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:50.1991166Z env: 2024-08-06T20:52:50.1991609Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:50.1991910Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:50.1992768Z GITHUB_TOKEN: *** 2024-08-06T20:52:50.1993015Z ##[endgroup] 2024-08-06T20:52:50.2019779Z + python3 .github/scripts/get_workflow_job_id.py 10273124344 i-0756c52faebfec338 2024-08-06T20:52:54.3555421Z setting job-id=28428649080 2024-08-06T20:52:54.3556025Z setting job-name=linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:52:54.3722211Z Prepare all required actions 2024-08-06T20:52:54.3722567Z Getting action download info 2024-08-06T20:52:54.4722980Z ##[group]Run ./.github/actions/filter-test-configs 2024-08-06T20:52:54.4723304Z with: 2024-08-06T20:52:54.4723684Z github-token: *** 2024-08-06T20:52:54.4725199Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]} 2024-08-06T20:52:54.4727003Z job-name: linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:52:54.4727502Z env: 2024-08-06T20:52:54.4727735Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:54.4728042Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:54.4728357Z ##[endgroup] 2024-08-06T20:52:54.4768072Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2024-08-06T20:52:54.4768439Z with: 2024-08-06T20:52:54.4768624Z shell: bash 2024-08-06T20:52:54.4768823Z timeout_minutes: 10 2024-08-06T20:52:54.4769036Z max_attempts: 5 2024-08-06T20:52:54.4769247Z retry_wait_seconds: 30 2024-08-06T20:52:54.4769938Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2024-08-06T20:52:54.4770655Z polling_interval_seconds: 1 2024-08-06T20:52:54.4770905Z warning_on_retry: true 2024-08-06T20:52:54.4771145Z continue_on_error: false 2024-08-06T20:52:54.4771367Z env: 2024-08-06T20:52:54.4771559Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:54.4771857Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:54.4772495Z GITHUB_TOKEN: *** 2024-08-06T20:52:54.4772711Z ##[endgroup] 2024-08-06T20:52:54.5533448Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2024-08-06T20:52:54.7910327Z Defaulting to user installation because normal site-packages is not writeable 2024-08-06T20:52:54.8087027Z Requirement already satisfied: requests==2.27.1 in /home/ec2-user/.local/lib/python3.9/site-packages (2.27.1) 2024-08-06T20:52:54.8091104Z Requirement already satisfied: pyyaml==6.0.1 in /home/ec2-user/.local/lib/python3.9/site-packages (6.0.1) 2024-08-06T20:52:54.8208306Z Requirement already satisfied: charset-normalizer~=2.0.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from requests==2.27.1) (2.0.12) 2024-08-06T20:52:54.8216622Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (2.10) 2024-08-06T20:52:54.8220467Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (1.25.10) 2024-08-06T20:52:54.8225685Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.9/site-packages (from requests==2.27.1) (2024.7.4) 2024-08-06T20:52:55.5310485Z Command completed after 1 attempt(s). 2024-08-06T20:52:55.5372700Z ##[group]Run set -x 2024-08-06T20:52:55.5372938Z set -x 2024-08-06T20:52:55.5373140Z  2024-08-06T20:52:55.5373482Z # Use relative path here as this could be checked out anywhere, not necessarily 2024-08-06T20:52:55.5373913Z # in runner workspace 2024-08-06T20:52:55.5374258Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2024-08-06T20:52:55.5383796Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:55.5384135Z env: 2024-08-06T20:52:55.5384329Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:55.5384628Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:55.5384944Z ##[endgroup] 2024-08-06T20:52:55.5413178Z + python3 /home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2024-08-06T20:52:55.5659034Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2024-08-06T20:52:55.5659454Z echo "Workflow: ${GITHUB_WORKFLOW}" 2024-08-06T20:52:55.5659797Z echo "Job name: ${JOB_NAME}" 2024-08-06T20:52:55.5660101Z  2024-08-06T20:52:55.5660503Z # Use relative path here as this could be checked out anywhere, not necessarily 2024-08-06T20:52:55.5661002Z # in runner workspace 2024-08-06T20:52:55.5661451Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2024-08-06T20:52:55.5661955Z  --workflow "${GITHUB_WORKFLOW}" \ 2024-08-06T20:52:55.5662295Z  --job-name "${JOB_NAME}" \ 2024-08-06T20:52:55.5664336Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]}" \ 2024-08-06T20:52:55.5666399Z  --selected-test-configs "" \ 2024-08-06T20:52:55.5666746Z  --pr-number "${PR_NUMBER}" \ 2024-08-06T20:52:55.5667072Z  --tag "${TAG}" \ 2024-08-06T20:52:55.5667362Z  --event-name "${EVENT_NAME}" \ 2024-08-06T20:52:55.5667665Z  --schedule "${SCHEDULE}" \ 2024-08-06T20:52:55.5667958Z  --branch "${HEAD_BRANCH}" 2024-08-06T20:52:55.5676538Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:55.5676875Z env: 2024-08-06T20:52:55.5677074Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:55.5677398Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:55.5677903Z GITHUB_TOKEN: *** 2024-08-06T20:52:55.5678368Z JOB_NAME: linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:52:55.5679065Z PR_NUMBER: 132710 2024-08-06T20:52:55.5679281Z TAG: 2024-08-06T20:52:55.5679483Z EVENT_NAME: pull_request 2024-08-06T20:52:55.5679722Z SCHEDULE: 2024-08-06T20:52:55.5679928Z HEAD_BRANCH: 2024-08-06T20:52:55.5680141Z ##[endgroup] 2024-08-06T20:52:55.5709825Z Workflow: pull 2024-08-06T20:52:55.5710301Z Job name: linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:52:55.8078645Z INFO:root:Found no test-config label on the PR, so all test configs are included 2024-08-06T20:52:55.9479255Z ##[group]Run echo "Filtered matrix:" 2024-08-06T20:52:55.9479572Z echo "Filtered matrix:" 2024-08-06T20:52:55.9481148Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]}" 2024-08-06T20:52:55.9482713Z  2024-08-06T20:52:55.9482902Z echo 2024-08-06T20:52:55.9483151Z echo "Is the current job unstable? False" 2024-08-06T20:52:55.9483457Z  2024-08-06T20:52:55.9483647Z echo 2024-08-06T20:52:55.9483880Z echo "Is keep-going label set? False" 2024-08-06T20:52:55.9484172Z  2024-08-06T20:52:55.9484353Z echo 2024-08-06T20:52:55.9484575Z echo "Renabled issues? " 2024-08-06T20:52:55.9493973Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:55.9494462Z env: 2024-08-06T20:52:55.9494667Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:55.9494979Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:55.9495303Z ##[endgroup] 2024-08-06T20:52:55.9522847Z Filtered matrix: 2024-08-06T20:52:55.9524733Z {include: [{config: default, shard: 1, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 2, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 3, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 4, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 5, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}]} 2024-08-06T20:52:55.9526575Z 2024-08-06T20:52:55.9526680Z Is the current job unstable? False 2024-08-06T20:52:55.9526901Z 2024-08-06T20:52:55.9527013Z Is keep-going label set? False 2024-08-06T20:52:55.9527203Z 2024-08-06T20:52:55.9527310Z Renabled issues? 2024-08-06T20:52:55.9571293Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2024-08-06T20:52:55.9571754Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2024-08-06T20:52:55.9580506Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T20:52:55.9580845Z env: 2024-08-06T20:52:55.9581036Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:55.9581342Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:55.9581667Z JOB_TIMEOUT: 240 2024-08-06T20:52:55.9581872Z ##[endgroup] 2024-08-06T20:52:55.9655093Z ##[group]Run set -x 2024-08-06T20:52:55.9655374Z set -x 2024-08-06T20:52:55.9655566Z  2024-08-06T20:52:55.9655795Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2024-08-06T20:52:55.9656148Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2024-08-06T20:52:55.9656494Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2024-08-06T20:52:55.9656816Z  TEST_COMMAND=.ci/onnx/test.sh 2024-08-06T20:52:55.9657108Z else 2024-08-06T20:52:55.9657354Z  TEST_COMMAND=.ci/pytorch/test.sh 2024-08-06T20:52:55.9657628Z fi 2024-08-06T20:52:55.9657813Z  2024-08-06T20:52:55.9658288Z # detached container should get cleaned up by teardown_ec2_linux 2024-08-06T20:52:55.9658764Z # TODO: Stop building test binaries as part of the build phase 2024-08-06T20:52:55.9659179Z # Used for GPU_FLAG since that doesn't play nice 2024-08-06T20:52:55.9659541Z # shellcheck disable=SC2086,SC2090 2024-08-06T20:52:55.9659839Z container_name=$(docker run \ 2024-08-06T20:52:55.9660117Z  ${GPU_FLAG:-} \ 2024-08-06T20:52:55.9660369Z  -e BUILD_ENVIRONMENT \ 2024-08-06T20:52:55.9660630Z  -e PR_NUMBER \ 2024-08-06T20:52:55.9660875Z  -e GITHUB_ACTIONS \ 2024-08-06T20:52:55.9661135Z  -e GITHUB_REPOSITORY \ 2024-08-06T20:52:55.9661392Z  -e GITHUB_WORKFLOW \ 2024-08-06T20:52:55.9661647Z  -e GITHUB_JOB \ 2024-08-06T20:52:55.9661892Z  -e GITHUB_RUN_ID \ 2024-08-06T20:52:55.9662136Z  -e GITHUB_RUN_NUMBER \ 2024-08-06T20:52:55.9662400Z  -e GITHUB_RUN_ATTEMPT \ 2024-08-06T20:52:55.9662669Z  -e JOB_ID \ 2024-08-06T20:52:55.9662890Z  -e JOB_NAME \ 2024-08-06T20:52:55.9663123Z  -e BASE_SHA \ 2024-08-06T20:52:55.9663354Z  -e BRANCH \ 2024-08-06T20:52:55.9663570Z  -e SHA1 \ 2024-08-06T20:52:55.9663802Z  -e AWS_DEFAULT_REGION \ 2024-08-06T20:52:55.9664066Z  -e IN_WHEEL_TEST \ 2024-08-06T20:52:55.9664304Z  -e SHARD_NUMBER \ 2024-08-06T20:52:55.9664552Z  -e TEST_CONFIG \ 2024-08-06T20:52:55.9664795Z  -e NUM_TEST_SHARDS \ 2024-08-06T20:52:55.9665051Z  -e REENABLED_ISSUES \ 2024-08-06T20:52:55.9665322Z  -e CONTINUE_THROUGH_ERROR \ 2024-08-06T20:52:55.9665599Z  -e VERBOSE_TEST_LOGS \ 2024-08-06T20:52:55.9665855Z  -e TEST_SHOWLOCALS \ 2024-08-06T20:52:55.9666109Z  -e NO_TEST_TIMEOUT \ 2024-08-06T20:52:55.9666358Z  -e NO_TD \ 2024-08-06T20:52:55.9666580Z  -e TD_DISTRIBUTED \ 2024-08-06T20:52:55.9666842Z  -e PR_LABELS \ 2024-08-06T20:52:55.9667107Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2024-08-06T20:52:55.9667397Z  -e SCCACHE_BUCKET \ 2024-08-06T20:52:55.9667656Z  -e SCCACHE_S3_KEY_PREFIX \ 2024-08-06T20:52:55.9667926Z  -e XLA_CUDA \ 2024-08-06T20:52:55.9668183Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2024-08-06T20:52:55.9668508Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2024-08-06T20:52:55.9668847Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2024-08-06T20:52:55.9669170Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2024-08-06T20:52:55.9669477Z  -e HUGGING_FACE_HUB_TOKEN \ 2024-08-06T20:52:55.9669763Z  -e DASHBOARD_TAG \ 2024-08-06T20:52:55.9670063Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2024-08-06T20:52:55.9670424Z  --security-opt seccomp=unconfined \ 2024-08-06T20:52:55.9670733Z  --cap-add=SYS_PTRACE \ 2024-08-06T20:52:55.9671000Z  --ipc=host \ 2024-08-06T20:52:55.9671247Z  --shm-size="${SHM_SIZE}" \ 2024-08-06T20:52:55.9671515Z  --tty \ 2024-08-06T20:52:55.9671723Z  --detach \ 2024-08-06T20:52:55.9671965Z  --name="${container_name}" \ 2024-08-06T20:52:55.9672383Z  --user jenkins \ 2024-08-06T20:52:55.9672696Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2024-08-06T20:52:55.9673053Z  -w /var/lib/jenkins/workspace \ 2024-08-06T20:52:55.9673340Z  "${DOCKER_IMAGE}" 2024-08-06T20:52:55.9673566Z ) 2024-08-06T20:52:55.9673827Z # Propagate download.pytorch.org IP to container 2024-08-06T20:52:55.9674418Z grep download.pytorch.org /etc/hosts | docker exec -i "${container_name}" sudo bash -c "/bin/cat >> /etc/hosts" 2024-08-06T20:52:55.9675036Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2024-08-06T20:52:55.9675618Z docker exec -t "${container_name}" sh -c "pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2024-08-06T20:52:55.9683695Z shell: /usr/bin/bash -e {0} 2024-08-06T20:52:55.9683933Z env: 2024-08-06T20:52:55.9684132Z GIT_DEFAULT_BRANCH: main 2024-08-06T20:52:55.9684429Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:52:55.9684838Z BUILD_ENVIRONMENT: linux-focal-cuda12.1-py3.10-gcc9-sm86 2024-08-06T20:52:55.9685168Z PR_NUMBER: 132710 2024-08-06T20:52:55.9685400Z GITHUB_REPOSITORY: pytorch/pytorch 2024-08-06T20:52:55.9685668Z GITHUB_WORKFLOW: pull 2024-08-06T20:52:55.9685895Z GITHUB_JOB: test 2024-08-06T20:52:55.9686113Z GITHUB_RUN_ID: 10273124344 2024-08-06T20:52:55.9686351Z GITHUB_RUN_NUMBER: 233985 2024-08-06T20:52:55.9686589Z GITHUB_RUN_ATTEMPT: 1 2024-08-06T20:52:55.9686812Z JOB_ID: 28428649080 2024-08-06T20:52:55.9687290Z JOB_NAME: linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:52:55.9687826Z BRANCH: pull/132710 2024-08-06T20:52:55.9688079Z SHA1: b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-06T20:52:55.9688417Z BASE_SHA: 1736af7cf736184c356be1bb00f59fb2feea6d7d 2024-08-06T20:52:55.9688715Z TEST_CONFIG: default 2024-08-06T20:52:55.9688944Z SHARD_NUMBER: 2 2024-08-06T20:52:55.9689148Z NUM_TEST_SHARDS: 5 2024-08-06T20:52:55.9689371Z REENABLED_ISSUES: 2024-08-06T20:52:55.9689599Z CONTINUE_THROUGH_ERROR: False 2024-08-06T20:52:55.9689849Z VERBOSE_TEST_LOGS: False 2024-08-06T20:52:55.9690092Z TEST_SHOWLOCALS: False 2024-08-06T20:52:55.9690325Z NO_TEST_TIMEOUT: False 2024-08-06T20:52:55.9690541Z NO_TD: False 2024-08-06T20:52:55.9690747Z TD_DISTRIBUTED: False 2024-08-06T20:52:55.9691028Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2024-08-06T20:52:55.9691349Z SCCACHE_S3_KEY_PREFIX: pull 2024-08-06T20:52:55.9691594Z SHM_SIZE: 2g 2024-08-06T20:52:55.9692532Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:52:55.9693232Z XLA_CUDA: 2024-08-06T20:52:55.9693556Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2024-08-06T20:52:55.9693964Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2024-08-06T20:52:55.9694254Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2024-08-06T20:52:55.9694594Z DASHBOARD_TAG: 2024-08-06T20:52:55.9694814Z HUGGING_FACE_HUB_TOKEN: 2024-08-06T20:52:55.9695039Z ##[endgroup] 2024-08-06T20:52:55.9722503Z + [[ default == \m\u\l\t\i\g\p\u ]] 2024-08-06T20:52:55.9722881Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *onnx* ]] 2024-08-06T20:52:55.9723222Z + TEST_COMMAND=.ci/pytorch/test.sh 2024-08-06T20:52:55.9731864Z +++ nproc --ignore=2 2024-08-06T20:52:55.9747324Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=14 -e SCCACHE_BUCKET -e SCCACHE_S3_KEY_PREFIX -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/tmp/github_env_10273124344 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T20:53:07.3295212Z + container_name=88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T20:53:07.3301209Z + grep download.pytorch.org /etc/hosts 2024-08-06T20:53:07.3303303Z + docker exec -i 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 sudo bash -c '/bin/cat >> /etc/hosts' 2024-08-06T20:53:07.4761137Z + echo DOCKER_CONTAINER_ID=88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T20:53:07.4765760Z ++ echo dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl 2024-08-06T20:53:07.4768849Z + docker exec -t 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 sh -c 'pip install dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2024-08-06T20:53:07.8946741Z Processing ./dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl (from torch==2.5.0a0+gitb9d86fa) 2024-08-06T20:53:08.2155846Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (3.13.1) 2024-08-06T20:53:08.2159619Z Requirement already satisfied: typing-extensions>=4.8.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (4.12.2) 2024-08-06T20:53:08.2163115Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (2.8.8) 2024-08-06T20:53:08.2165975Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (3.1.4) 2024-08-06T20:53:08.2169013Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (2024.6.1) 2024-08-06T20:53:08.2178294Z Requirement already satisfied: sympy>=1.13.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (1.13.1) 2024-08-06T20:53:08.2206027Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (3.3.0) 2024-08-06T20:53:08.2262284Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (1.21.2) 2024-08-06T20:53:08.2296260Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.0->torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (1.3.0) 2024-08-06T20:53:08.3185956Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (2.1.5) 2024-08-06T20:53:09.1237468Z Installing collected packages: torch 2024-08-06T20:53:19.6505250Z Successfully installed torch-2.5.0a0+gitb9d86fa 2024-08-06T20:53:19.7282960Z ++ dirname .ci/pytorch/test.sh 2024-08-06T20:53:19.7293494Z + source .ci/pytorch/common.sh 2024-08-06T20:53:19.7296558Z +++ dirname .ci/pytorch/common.sh 2024-08-06T20:53:19.7307290Z ++ source .ci/pytorch/common_utils.sh 2024-08-06T20:53:19.7308782Z +++ declare -f -t trap_add 2024-08-06T20:53:19.7314625Z ++ set -ex 2024-08-06T20:53:19.7315020Z ++ [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *rocm* ]] 2024-08-06T20:53:19.7315740Z ++ BUILD_TEST_LIBTORCH=0 2024-08-06T20:53:19.7316265Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-06T20:53:19.7319447Z ++ stat -c %u /var/lib/jenkins/workspace 2024-08-06T20:53:19.7338915Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2024-08-06T20:53:19.7339302Z + trap_add cleanup_workspace EXIT 2024-08-06T20:53:19.7339596Z + trap_add_cmd=cleanup_workspace 2024-08-06T20:53:19.7339854Z + shift 2024-08-06T20:53:19.7340046Z + for trap_add_name in "$@" 2024-08-06T20:53:19.7347609Z +++ trap -p EXIT 2024-08-06T20:53:19.7350638Z ++ eval 'extract_trap_cmd ' 2024-08-06T20:53:19.7350990Z +++ extract_trap_cmd 2024-08-06T20:53:19.7351227Z +++ printf '%s\n' '' 2024-08-06T20:53:19.7351772Z ++ printf '%s\n' cleanup_workspace 2024-08-06T20:53:19.7355441Z + trap -- ' 2024-08-06T20:53:19.7355883Z cleanup_workspace' EXIT 2024-08-06T20:53:19.7356192Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2024-08-06T20:53:20.4587670Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2024-08-06T20:53:20.4610885Z + echo 'Environment variables:' 2024-08-06T20:53:20.4611213Z Environment variables: 2024-08-06T20:53:20.4611534Z + env 2024-08-06T20:53:20.4622442Z INSTALLED_DB=yes 2024-08-06T20:53:20.4622751Z NV_LIBCUBLAS_VERSION=12.1.3.1-1 2024-08-06T20:53:20.4623040Z NVIDIA_VISIBLE_DEVICES=all 2024-08-06T20:53:20.4623380Z NV_NVML_DEV_VERSION=12.1.105-1 2024-08-06T20:53:20.4623837Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-06T20:53:20.4624233Z CONTINUE_THROUGH_ERROR=False 2024-08-06T20:53:20.4624534Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.17.1-1+cuda12.1 2024-08-06T20:53:20.4624869Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.17.1-1 2024-08-06T20:53:20.4625245Z BUILD_ENVIRONMENT=linux-focal-cuda12.1-py3.10-gcc9-sm86 2024-08-06T20:53:20.4625590Z HOSTNAME=88943b22963e 2024-08-06T20:53:20.4626110Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4626659Z GITHUB_ACTION=__self 2024-08-06T20:53:20.4626924Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2024-08-06T20:53:20.4629593Z NVIDIA_REQUIRE_CUDA=cuda>=12.1 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=tesla,driver>=525,driver<526 brand=unknown,driver>=525,driver<526 brand=nvidia,driver>=525,driver<526 brand=nvidiartx,driver>=525,driver<526 brand=geforce,driver>=525,driver<526 brand=geforcertx,driver>=525,driver<526 brand=quadro,driver>=525,driver<526 brand=quadrortx,driver>=525,driver<526 brand=titan,driver>=525,driver<526 brand=titanrtx,driver>=525,driver<526 2024-08-06T20:53:20.4632306Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-12-1=12.1.3.1-1 2024-08-06T20:53:20.4632642Z NV_NVTX_VERSION=12.1.105-1 2024-08-06T20:53:20.4632884Z GITHUB_RUN_NUMBER=233985 2024-08-06T20:53:20.4633113Z TEST_CONFIG=default 2024-08-06T20:53:20.4633344Z GITHUB_REPOSITORY_OWNER_ID=21003710 2024-08-06T20:53:20.4633635Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2024-08-06T20:53:20.4633933Z NV_CUDA_CUDART_DEV_VERSION=12.1.105-1 2024-08-06T20:53:20.4634216Z NV_LIBCUSPARSE_VERSION=12.1.0.106-1 2024-08-06T20:53:20.4634482Z NV_LIBNPP_VERSION=12.1.0.40-1 2024-08-06T20:53:20.4634745Z GITHUB_TRIGGERING_ACTOR=drisspg 2024-08-06T20:53:20.4635049Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2024-08-06T20:53:20.4635358Z GITHUB_REF_TYPE=branch 2024-08-06T20:53:20.4635591Z TORCH_CUDA_ARCH_LIST=Maxwell 2024-08-06T20:53:20.4635840Z NCCL_VERSION=2.17.1-1 2024-08-06T20:53:20.4636101Z BASE_SHA=1736af7cf736184c356be1bb00f59fb2feea6d7d 2024-08-06T20:53:20.4636395Z XLA_CUDA= 2024-08-06T20:53:20.4636590Z HUGGING_FACE_HUB_TOKEN= 2024-08-06T20:53:20.4637011Z *** 2024-08-06T20:53:20.4637212Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2024-08-06T20:53:20.4637489Z GITHUB_REPOSITORY_ID=65600975 2024-08-06T20:53:20.4638057Z GITHUB_ACTIONS=true 2024-08-06T20:53:20.4638291Z NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:53:20.4638588Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-12-1=12.1.105-1 2024-08-06T20:53:20.4638917Z NV_LIBNPP_PACKAGE=libnpp-12-1=12.1.0.40-1 2024-08-06T20:53:20.4639229Z SHA1=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-06T20:53:20.4639543Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2024-08-06T20:53:20.4639861Z GITHUB_SHA=bf5bb5a1585a03379137fab341e87c02c77e76cd 2024-08-06T20:53:20.4640341Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/pull.yml@refs/pull/132710/merge 2024-08-06T20:53:20.4640773Z UCC_HOME=/usr 2024-08-06T20:53:20.4640981Z NV_LIBCUBLAS_DEV_VERSION=12.1.3.1-1 2024-08-06T20:53:20.4641418Z VERBOSE_TEST_LOGS=False 2024-08-06T20:53:20.4641644Z NVIDIA_PRODUCT_NAME=CUDA 2024-08-06T20:53:20.4641919Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-12-1 2024-08-06T20:53:20.4642237Z GITHUB_REF=refs/pull/132710/merge 2024-08-06T20:53:20.4642495Z NV_CUDA_CUDART_VERSION=12.1.105-1 2024-08-06T20:53:20.4642757Z SHARD_NUMBER=2 2024-08-06T20:53:20.4642967Z GITHUB_REF_PROTECTED=false 2024-08-06T20:53:20.4643203Z HOME=/var/lib/jenkins 2024-08-06T20:53:20.4643457Z GITHUB_API_URL=https://api.github.com 2024-08-06T20:53:20.4643753Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2024-08-06T20:53:20.4644058Z UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb 2024-08-06T20:53:20.4644377Z SCCACHE_S3_KEY_PREFIX=pull 2024-08-06T20:53:20.4644615Z CUDA_VERSION=12.1.1 2024-08-06T20:53:20.4644854Z NV_LIBCUBLAS_PACKAGE=libcublas-12-1=12.1.3.1-1 2024-08-06T20:53:20.4645148Z NUM_TEST_SHARDS=5 2024-08-06T20:53:20.4645353Z UCX_HOME=/usr 2024-08-06T20:53:20.4645656Z NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE=cuda-nsight-compute-12-1=12.1.1-1 2024-08-06T20:53:20.4646336Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4647139Z JOB_NAME=linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:53:20.4658555Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4659339Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2024-08-06T20:53:20.4659807Z GITHUB_EVENT_NAME=pull_request 2024-08-06T20:53:20.4660067Z DASHBOARD_TAG= 2024-08-06T20:53:20.4660288Z GITHUB_RUN_ID=10273124344 2024-08-06T20:53:20.4660578Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-12-1=12.1.0.40-1 2024-08-06T20:53:20.4660913Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-12-1 2024-08-06T20:53:20.4661538Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4662149Z GITHUB_ACTOR=drisspg 2024-08-06T20:53:20.4662385Z NV_LIBNPP_DEV_VERSION=12.1.0.40-1 2024-08-06T20:53:20.4662651Z PR_NUMBER=132710 2024-08-06T20:53:20.4662875Z GITHUB_RUN_ATTEMPT=1 2024-08-06T20:53:20.4663110Z ANACONDA_PYTHON_VERSION=3.10 2024-08-06T20:53:20.4663427Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2024-08-06T20:53:20.4663747Z TERM=xterm 2024-08-06T20:53:20.4663962Z NV_LIBCUSPARSE_DEV_VERSION=12.1.0.106-1 2024-08-06T20:53:20.4664257Z INSTALLED_VISION=yes 2024-08-06T20:53:20.4664486Z BRANCH=pull/132710 2024-08-06T20:53:20.4664730Z OPENSSL_ROOT_DIR=/opt/openssl 2024-08-06T20:53:20.4665012Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2024-08-06T20:53:20.4665304Z CUDA_PATH=/usr/local/cuda 2024-08-06T20:53:20.4665773Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2024-08-06T20:53:20.4666297Z GITHUB_SERVER_URL=https://github.com 2024-08-06T20:53:20.4666633Z UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b 2024-08-06T20:53:20.4666953Z REENABLED_ISSUES= 2024-08-06T20:53:20.4667167Z SHLVL=1 2024-08-06T20:53:20.4667357Z MAX_JOBS=14 2024-08-06T20:53:20.4667561Z NV_CUDA_LIB_VERSION=12.1.1-1 2024-08-06T20:53:20.4667803Z NVARCH=x86_64 2024-08-06T20:53:20.4668017Z GITHUB_ACTOR_ID=32754868 2024-08-06T20:53:20.4668448Z GITHUB_WORKFLOW_SHA=bf5bb5a1585a03379137fab341e87c02c77e76cd 2024-08-06T20:53:20.4668809Z GITHUB_REF_NAME=132710/merge 2024-08-06T20:53:20.4669084Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-12-1 2024-08-06T20:53:20.4669489Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2024-08-06T20:53:20.4669884Z GITHUB_JOB=test 2024-08-06T20:53:20.4670128Z NV_LIBNCCL_PACKAGE=libnccl2=2.17.1-1+cuda12.1 2024-08-06T20:53:20.4670494Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-08-06T20:53:20.4670844Z NO_TEST_TIMEOUT=False 2024-08-06T20:53:20.4671077Z TD_DISTRIBUTED=False 2024-08-06T20:53:20.4671319Z NV_CUDA_NSIGHT_COMPUTE_VERSION=12.1.1-1 2024-08-06T20:53:20.4671705Z GITHUB_REPOSITORY=pytorch/pytorch 2024-08-06T20:53:20.4671982Z NV_NVPROF_VERSION=12.1.105-1 2024-08-06T20:53:20.4672230Z GITHUB_RETENTION_DAYS=90 2024-08-06T20:53:20.4672474Z OPENSSL_DIR=/opt/openssl 2024-08-06T20:53:20.4672722Z GITHUB_ACTION_REPOSITORY= 2024-08-06T20:53:20.4673415Z PATH=/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-06T20:53:20.4674159Z GITHUB_BASE_REF=gh/drisspg/23/base 2024-08-06T20:53:20.4674482Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2024-08-06T20:53:20.4674733Z CI=true 2024-08-06T20:53:20.4674948Z NV_LIBNCCL_PACKAGE_VERSION=2.17.1-1 2024-08-06T20:53:20.4675237Z GITHUB_REPOSITORY_OWNER=pytorch 2024-08-06T20:53:20.4675494Z JOB_ID=28428649080 2024-08-06T20:53:20.4675719Z INSTALLED_PROTOBUF=yes 2024-08-06T20:53:20.4675968Z GITHUB_HEAD_REF=gh/drisspg/23/head 2024-08-06T20:53:20.4676230Z GITHUB_ACTION_REF= 2024-08-06T20:53:20.4676501Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2024-08-06T20:53:20.4676828Z TEST_SHOWLOCALS=False 2024-08-06T20:53:20.4677064Z GITHUB_WORKFLOW=pull 2024-08-06T20:53:20.4677306Z DEBIAN_FRONTEND=noninteractive 2024-08-06T20:53:20.4677867Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4678447Z NO_TD=False 2024-08-06T20:53:20.4678665Z SKIP_SCCACHE_INITIALIZATION=1 2024-08-06T20:53:20.4678917Z _=/usr/bin/env 2024-08-06T20:53:20.4679215Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2024-08-06T20:53:20.4800936Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2024-08-06T20:53:20.4802197Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T20:53:20.4803221Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2024-08-06T20:53:20.4803963Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2024-08-06T20:53:20.4804379Z + BUILD_DIR=build 2024-08-06T20:53:20.4804600Z + BUILD_RENAMED_DIR=build_renamed 2024-08-06T20:53:20.4804871Z + BUILD_BIN_DIR=build/bin 2024-08-06T20:53:20.4805109Z + SHARD_NUMBER=2 2024-08-06T20:53:20.4805310Z + NUM_TEST_SHARDS=5 2024-08-06T20:53:20.4805579Z + export VALGRIND=ON 2024-08-06T20:53:20.4805894Z + VALGRIND=ON 2024-08-06T20:53:20.4806224Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *clang9* ]] 2024-08-06T20:53:20.4806567Z + [[ 0 == \1 ]] 2024-08-06T20:53:20.4806763Z + [[ False == \1 ]] 2024-08-06T20:53:20.4807038Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *bazel* ]] 2024-08-06T20:53:20.4807393Z ++ realpath build/custom_test_artifacts 2024-08-06T20:53:20.4818658Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2024-08-06T20:53:20.4819467Z + [[ -n '' ]] 2024-08-06T20:53:20.4819689Z + echo 'Environment variables' 2024-08-06T20:53:20.4819952Z Environment variables 2024-08-06T20:53:20.4820167Z + env 2024-08-06T20:53:20.4829755Z INSTALLED_DB=yes 2024-08-06T20:53:20.4830121Z NV_LIBCUBLAS_VERSION=12.1.3.1-1 2024-08-06T20:53:20.4830476Z NVIDIA_VISIBLE_DEVICES=all 2024-08-06T20:53:20.4830790Z NV_NVML_DEV_VERSION=12.1.105-1 2024-08-06T20:53:20.4831297Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-06T20:53:20.4831690Z CONTINUE_THROUGH_ERROR=False 2024-08-06T20:53:20.4832402Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.17.1-1+cuda12.1 2024-08-06T20:53:20.4832823Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.17.1-1 2024-08-06T20:53:20.4833249Z BUILD_ENVIRONMENT=linux-focal-cuda12.1-py3.10-gcc9-sm86 2024-08-06T20:53:20.4833629Z HOSTNAME=88943b22963e 2024-08-06T20:53:20.4834293Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4834871Z GITHUB_ACTION=__self 2024-08-06T20:53:20.4835113Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2024-08-06T20:53:20.4837720Z NVIDIA_REQUIRE_CUDA=cuda>=12.1 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=tesla,driver>=525,driver<526 brand=unknown,driver>=525,driver<526 brand=nvidia,driver>=525,driver<526 brand=nvidiartx,driver>=525,driver<526 brand=geforce,driver>=525,driver<526 brand=geforcertx,driver>=525,driver<526 brand=quadro,driver>=525,driver<526 brand=quadrortx,driver>=525,driver<526 brand=titan,driver>=525,driver<526 brand=titanrtx,driver>=525,driver<526 2024-08-06T20:53:20.4841090Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-12-1=12.1.3.1-1 2024-08-06T20:53:20.4841542Z NV_NVTX_VERSION=12.1.105-1 2024-08-06T20:53:20.4841818Z GITHUB_RUN_NUMBER=233985 2024-08-06T20:53:20.4842136Z TEST_CONFIG=default 2024-08-06T20:53:20.4842431Z GITHUB_REPOSITORY_OWNER_ID=21003710 2024-08-06T20:53:20.4842777Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2024-08-06T20:53:20.4843152Z NV_CUDA_CUDART_DEV_VERSION=12.1.105-1 2024-08-06T20:53:20.4843443Z NV_LIBCUSPARSE_VERSION=12.1.0.106-1 2024-08-06T20:53:20.4843716Z NV_LIBNPP_VERSION=12.1.0.40-1 2024-08-06T20:53:20.4843989Z GITHUB_TRIGGERING_ACTOR=drisspg 2024-08-06T20:53:20.4844301Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2024-08-06T20:53:20.4844621Z GITHUB_REF_TYPE=branch 2024-08-06T20:53:20.4844864Z TORCH_CUDA_ARCH_LIST=Maxwell 2024-08-06T20:53:20.4845124Z NCCL_VERSION=2.17.1-1 2024-08-06T20:53:20.4845386Z BASE_SHA=1736af7cf736184c356be1bb00f59fb2feea6d7d 2024-08-06T20:53:20.4845688Z XLA_CUDA= 2024-08-06T20:53:20.4845899Z HUGGING_FACE_HUB_TOKEN= 2024-08-06T20:53:20.4846240Z *** 2024-08-06T20:53:20.4846443Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2024-08-06T20:53:20.4846729Z GITHUB_REPOSITORY_ID=65600975 2024-08-06T20:53:20.4846974Z GITHUB_ACTIONS=true 2024-08-06T20:53:20.4847208Z NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T20:53:20.4847507Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-12-1=12.1.105-1 2024-08-06T20:53:20.4847842Z NV_LIBNPP_PACKAGE=libnpp-12-1=12.1.0.40-1 2024-08-06T20:53:20.4848156Z SHA1=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-06T20:53:20.4848474Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2024-08-06T20:53:20.4848797Z GITHUB_SHA=bf5bb5a1585a03379137fab341e87c02c77e76cd 2024-08-06T20:53:20.4849284Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/pull.yml@refs/pull/132710/merge 2024-08-06T20:53:20.4849726Z UCC_HOME=/usr 2024-08-06T20:53:20.4849939Z NV_LIBCUBLAS_DEV_VERSION=12.1.3.1-1 2024-08-06T20:53:20.4850221Z VERBOSE_TEST_LOGS=False 2024-08-06T20:53:20.4850460Z NVIDIA_PRODUCT_NAME=CUDA 2024-08-06T20:53:20.4850739Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-12-1 2024-08-06T20:53:20.4851062Z GITHUB_REF=refs/pull/132710/merge 2024-08-06T20:53:20.4851329Z NV_CUDA_CUDART_VERSION=12.1.105-1 2024-08-06T20:53:20.4851589Z SHARD_NUMBER=2 2024-08-06T20:53:20.4851814Z GITHUB_REF_PROTECTED=false 2024-08-06T20:53:20.4852053Z HOME=/var/lib/jenkins 2024-08-06T20:53:20.4852309Z GITHUB_API_URL=https://api.github.com 2024-08-06T20:53:20.4852625Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2024-08-06T20:53:20.4852940Z UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb 2024-08-06T20:53:20.4853266Z SCCACHE_S3_KEY_PREFIX=pull 2024-08-06T20:53:20.4853511Z CUDA_VERSION=12.1.1 2024-08-06T20:53:20.4853759Z NV_LIBCUBLAS_PACKAGE=libcublas-12-1=12.1.3.1-1 2024-08-06T20:53:20.4854214Z NUM_TEST_SHARDS=5 2024-08-06T20:53:20.4854587Z UCX_HOME=/usr 2024-08-06T20:53:20.4854905Z NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE=cuda-nsight-compute-12-1=12.1.1-1 2024-08-06T20:53:20.4855597Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4856418Z JOB_NAME=linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-06T20:53:20.4857204Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4857931Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2024-08-06T20:53:20.4858515Z GITHUB_EVENT_NAME=pull_request 2024-08-06T20:53:20.4858768Z DASHBOARD_TAG= 2024-08-06T20:53:20.4858988Z GITHUB_RUN_ID=10273124344 2024-08-06T20:53:20.4859272Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-12-1=12.1.0.40-1 2024-08-06T20:53:20.4859612Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-12-1 2024-08-06T20:53:20.4860237Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4860845Z GITHUB_ACTOR=drisspg 2024-08-06T20:53:20.4861081Z NV_LIBNPP_DEV_VERSION=12.1.0.40-1 2024-08-06T20:53:20.4861355Z PR_NUMBER=132710 2024-08-06T20:53:20.4861574Z GITHUB_RUN_ATTEMPT=1 2024-08-06T20:53:20.4861788Z VALGRIND=ON 2024-08-06T20:53:20.4862004Z ANACONDA_PYTHON_VERSION=3.10 2024-08-06T20:53:20.4862316Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2024-08-06T20:53:20.4862623Z TERM=xterm 2024-08-06T20:53:20.4862841Z NV_LIBCUSPARSE_DEV_VERSION=12.1.0.106-1 2024-08-06T20:53:20.4863135Z INSTALLED_VISION=yes 2024-08-06T20:53:20.4863353Z BRANCH=pull/132710 2024-08-06T20:53:20.4863586Z OPENSSL_ROOT_DIR=/opt/openssl 2024-08-06T20:53:20.4863864Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2024-08-06T20:53:20.4864178Z CUDA_PATH=/usr/local/cuda 2024-08-06T20:53:20.4864691Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2024-08-06T20:53:20.4865223Z GITHUB_SERVER_URL=https://github.com 2024-08-06T20:53:20.4865554Z UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b 2024-08-06T20:53:20.4865879Z REENABLED_ISSUES= 2024-08-06T20:53:20.4866090Z SHLVL=1 2024-08-06T20:53:20.4866272Z MAX_JOBS=14 2024-08-06T20:53:20.4866486Z NV_CUDA_LIB_VERSION=12.1.1-1 2024-08-06T20:53:20.4866729Z NVARCH=x86_64 2024-08-06T20:53:20.4866932Z GITHUB_ACTOR_ID=32754868 2024-08-06T20:53:20.4867253Z GITHUB_WORKFLOW_SHA=bf5bb5a1585a03379137fab341e87c02c77e76cd 2024-08-06T20:53:20.4867607Z GITHUB_REF_NAME=132710/merge 2024-08-06T20:53:20.4867885Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-12-1 2024-08-06T20:53:20.4868300Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2024-08-06T20:53:20.4868687Z GITHUB_JOB=test 2024-08-06T20:53:20.4868921Z NV_LIBNCCL_PACKAGE=libnccl2=2.17.1-1+cuda12.1 2024-08-06T20:53:20.4869292Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-08-06T20:53:20.4869651Z NO_TEST_TIMEOUT=False 2024-08-06T20:53:20.4869878Z TD_DISTRIBUTED=False 2024-08-06T20:53:20.4870116Z NV_CUDA_NSIGHT_COMPUTE_VERSION=12.1.1-1 2024-08-06T20:53:20.4870417Z GITHUB_REPOSITORY=pytorch/pytorch 2024-08-06T20:53:20.4870690Z NV_NVPROF_VERSION=12.1.105-1 2024-08-06T20:53:20.4870934Z GITHUB_RETENTION_DAYS=90 2024-08-06T20:53:20.4871183Z OPENSSL_DIR=/opt/openssl 2024-08-06T20:53:20.4871428Z GITHUB_ACTION_REPOSITORY= 2024-08-06T20:53:20.4872119Z PATH=/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-06T20:53:20.4872861Z GITHUB_BASE_REF=gh/drisspg/23/base 2024-08-06T20:53:20.4873147Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2024-08-06T20:53:20.4873390Z CI=true 2024-08-06T20:53:20.4873596Z NV_LIBNCCL_PACKAGE_VERSION=2.17.1-1 2024-08-06T20:53:20.4873878Z GITHUB_REPOSITORY_OWNER=pytorch 2024-08-06T20:53:20.4874126Z JOB_ID=28428649080 2024-08-06T20:53:20.4874445Z INSTALLED_PROTOBUF=yes 2024-08-06T20:53:20.4874695Z GITHUB_HEAD_REF=gh/drisspg/23/head 2024-08-06T20:53:20.4874954Z GITHUB_ACTION_REF= 2024-08-06T20:53:20.4875241Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2024-08-06T20:53:20.4875567Z TEST_SHOWLOCALS=False 2024-08-06T20:53:20.4875798Z GITHUB_WORKFLOW=pull 2024-08-06T20:53:20.4876041Z DEBIAN_FRONTEND=noninteractive 2024-08-06T20:53:20.4876607Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_df29f495-d13c-4d85-96f8-26c2a5f68d53 2024-08-06T20:53:20.4877182Z NO_TD=False 2024-08-06T20:53:20.4877396Z SKIP_SCCACHE_INITIALIZATION=1 2024-08-06T20:53:20.4877657Z _=/usr/bin/env 2024-08-06T20:53:20.4877955Z + echo 'Testing pytorch' 2024-08-06T20:53:20.4878188Z Testing pytorch 2024-08-06T20:53:20.4878414Z + export LANG=C.UTF-8 2024-08-06T20:53:20.4878636Z + LANG=C.UTF-8 2024-08-06T20:53:20.4878842Z + PR_NUMBER=132710 2024-08-06T20:53:20.4879058Z + [[ default == \d\e\f\a\u\l\t ]] 2024-08-06T20:53:20.4879335Z + export CUDA_VISIBLE_DEVICES=0 2024-08-06T20:53:20.4879592Z + CUDA_VISIBLE_DEVICES=0 2024-08-06T20:53:20.4879833Z + export HIP_VISIBLE_DEVICES=0 2024-08-06T20:53:20.4880089Z + HIP_VISIBLE_DEVICES=0 2024-08-06T20:53:20.4880329Z + [[ default == \d\i\s\t\r\i\b\u\t\e\d ]] 2024-08-06T20:53:20.4880601Z + [[ default == \s\l\o\w ]] 2024-08-06T20:53:20.4880953Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *slow-gradcheck* ]] 2024-08-06T20:53:20.4881383Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *cuda* ]] 2024-08-06T20:53:20.4881735Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2024-08-06T20:53:20.4882053Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2024-08-06T20:53:20.4882343Z + [[ default == *crossref* ]] 2024-08-06T20:53:20.4882652Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *rocm* ]] 2024-08-06T20:53:20.4883037Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *xpu* ]] 2024-08-06T20:53:20.4883428Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *-bazel-* ]] 2024-08-06T20:53:20.4883776Z + pip_install --user ninja==1.10.2 2024-08-06T20:53:20.4884144Z + pip install --progress-bar off --user ninja==1.10.2 2024-08-06T20:53:21.7664920Z Collecting ninja==1.10.2 2024-08-06T20:53:21.7821078Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2024-08-06T20:53:21.8399894Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2024-08-06T20:53:22.7499540Z Installing collected packages: ninja 2024-08-06T20:53:22.7587938Z  WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH. 2024-08-06T20:53:22.7588793Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2024-08-06T20:53:22.7866163Z Successfully installed ninja-1.10.2 2024-08-06T20:53:22.8643805Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-06T20:53:22.8645234Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-06T20:53:22.8646106Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *aarch64* ]] 2024-08-06T20:53:22.8646446Z + install_tlparse 2024-08-06T20:53:22.8646686Z + pip_install --user tlparse==0.3.7 2024-08-06T20:53:22.8647010Z + pip install --progress-bar off --user tlparse==0.3.7 2024-08-06T20:53:23.2591103Z Collecting tlparse==0.3.7 2024-08-06T20:53:23.2746906Z Downloading tlparse-0.3.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (346 bytes) 2024-08-06T20:53:23.2851509Z Downloading tlparse-0.3.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.2 MB) 2024-08-06T20:53:24.1682064Z Installing collected packages: tlparse 2024-08-06T20:53:24.2078696Z Successfully installed tlparse-0.3.7 2024-08-06T20:53:24.2832266Z ++ python -m site --user-base 2024-08-06T20:53:24.3014791Z + PATH=/var/lib/jenkins/.local/bin:/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-06T20:53:24.3015817Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *asan* ]] 2024-08-06T20:53:24.3016263Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *-debug* ]] 2024-08-06T20:53:24.3016664Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *-bazel-* ]] 2024-08-06T20:53:24.3017196Z + echo 'We are not in debug mode: linux-focal-cuda12.1-py3.10-gcc9-sm86. Expect the assertion to pass' 2024-08-06T20:53:24.3017853Z We are not in debug mode: linux-focal-cuda12.1-py3.10-gcc9-sm86. Expect the assertion to pass 2024-08-06T20:53:24.3019949Z + cd test 2024-08-06T20:53:24.3020822Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2024-08-06T20:53:25.9356879Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2024-08-06T20:53:25.9357237Z + [[ default == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2024-08-06T20:53:25.9361273Z + DYNAMO_BENCHMARK_FLAGS=() 2024-08-06T20:53:25.9361563Z + [[ default == *dynamo_eager* ]] 2024-08-06T20:53:25.9361836Z + [[ default == *aot_eager* ]] 2024-08-06T20:53:25.9362082Z + [[ default == *aot_inductor* ]] 2024-08-06T20:53:25.9362344Z + [[ default == *inductor* ]] 2024-08-06T20:53:25.9362595Z + [[ default == *dynamic* ]] 2024-08-06T20:53:25.9362833Z + [[ default == *cpu* ]] 2024-08-06T20:53:25.9363116Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2024-08-06T20:53:25.9393570Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *libtorch* ]] 2024-08-06T20:53:25.9394076Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *-bazel-* ]] 2024-08-06T20:53:25.9396410Z + cd test 2024-08-06T20:53:25.9397212Z + python -c 'import torch; print(torch.__config__.show())' 2024-08-06T20:53:27.4108320Z PyTorch built with: 2024-08-06T20:53:27.4108655Z - GCC 9.4 2024-08-06T20:53:27.4108868Z - C++ Version: 201703 2024-08-06T20:53:27.4109409Z - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications 2024-08-06T20:53:27.4110052Z - Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67) 2024-08-06T20:53:27.4110446Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2024-08-06T20:53:27.4110756Z - LAPACK is enabled (usually provided by MKL) 2024-08-06T20:53:27.4111057Z - NNPACK is enabled 2024-08-06T20:53:27.4111289Z - CPU capability usage: AVX2 2024-08-06T20:53:27.4111543Z - CUDA Runtime 12.1 2024-08-06T20:53:27.4111857Z - NVCC architecture flags: -gencode;arch=compute_86,code=sm_86 2024-08-06T20:53:27.4112227Z - CuDNN 90.1 (built against CUDA 12.4) 2024-08-06T20:53:27.4112516Z - Magma 2.6.1 2024-08-06T20:53:27.4119290Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=9.1.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Werror -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.5.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 2024-08-06T20:53:27.4123730Z 2024-08-06T20:53:27.7319700Z + cd test 2024-08-06T20:53:27.7320281Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2024-08-06T20:53:29.0918996Z ATen/Parallel: 2024-08-06T20:53:29.0919427Z at::get_num_threads() : 8 2024-08-06T20:53:29.0919843Z at::get_num_interop_threads() : 16 2024-08-06T20:53:29.0920270Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2024-08-06T20:53:29.0920684Z omp_get_max_threads() : 8 2024-08-06T20:53:29.0921474Z Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications 2024-08-06T20:53:29.0922304Z mkl_get_max_threads() : 8 2024-08-06T20:53:29.0922836Z Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67) 2024-08-06T20:53:29.0923933Z std::thread::hardware_concurrency() : 16 2024-08-06T20:53:29.0924372Z Environment variables: 2024-08-06T20:53:29.0924727Z OMP_NUM_THREADS : [not set] 2024-08-06T20:53:29.0925108Z MKL_NUM_THREADS : [not set] 2024-08-06T20:53:29.0925489Z ATen parallel backend: OpenMP 2024-08-06T20:53:29.0925758Z 2024-08-06T20:53:29.3727571Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *aarch64* ]] 2024-08-06T20:53:29.3727980Z + [[ default == *backward* ]] 2024-08-06T20:53:29.3728232Z + [[ default == *xla* ]] 2024-08-06T20:53:29.3728460Z + [[ default == *executorch* ]] 2024-08-06T20:53:29.3728722Z + [[ default == \j\i\t\_\l\e\g\a\c\y ]] 2024-08-06T20:53:29.3729064Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *libtorch* ]] 2024-08-06T20:53:29.3729402Z + [[ default == distributed ]] 2024-08-06T20:53:29.3729667Z + [[ default == *inductor_distributed* ]] 2024-08-06T20:53:29.3729963Z + [[ default == *inductor-halide* ]] 2024-08-06T20:53:29.3730256Z + [[ default == *inductor-micro-benchmark* ]] 2024-08-06T20:53:29.3730569Z + [[ default == *huggingface* ]] 2024-08-06T20:53:29.3730822Z + [[ default == *timm* ]] 2024-08-06T20:53:29.3731050Z + [[ default == *torchbench* ]] 2024-08-06T20:53:29.3731362Z + [[ default == *inductor_cpp_wrapper_abi_compatible* ]] 2024-08-06T20:53:29.3731789Z + [[ default == *inductor* ]] 2024-08-06T20:53:29.3732038Z + [[ default == *dynamo* ]] 2024-08-06T20:53:29.3732398Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *rocm* ]] 2024-08-06T20:53:29.3732718Z + [[ 2 == 1 ]] 2024-08-06T20:53:29.3732917Z + [[ 2 == 2 ]] 2024-08-06T20:53:29.3733157Z + [[ 5 -gt 1 ]] 2024-08-06T20:53:29.3733446Z + install_torchvision 2024-08-06T20:53:29.3733705Z + local orig_preload 2024-08-06T20:53:29.3733918Z + local commit 2024-08-06T20:53:29.3734212Z ++ get_pinned_commit vision 2024-08-06T20:53:29.3734483Z ++ cat .github/ci_commit_pins/vision.txt 2024-08-06T20:53:29.3751038Z + commit=d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:29.3751359Z + orig_preload= 2024-08-06T20:53:29.3751647Z + '[' -n '' ']' 2024-08-06T20:53:29.3752156Z + pip_install --no-use-pep517 --user git+https://github.com/pytorch/vision.git@d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:29.3753047Z + pip install --progress-bar off --no-use-pep517 --user git+https://github.com/pytorch/vision.git@d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:29.7276847Z Collecting git+https://github.com/pytorch/vision.git@d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:29.7281778Z Cloning https://github.com/pytorch/vision.git (to revision d23a6e1664d20707c11781299611436e1f0c104f) to /tmp/pip-req-build-6o8g622d 2024-08-06T20:53:29.7312863Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-6o8g622d 2024-08-06T20:53:31.2534455Z Running command git rev-parse -q --verify 'sha^d23a6e1664d20707c11781299611436e1f0c104f' 2024-08-06T20:53:31.2559706Z Running command git fetch -q https://github.com/pytorch/vision.git d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:32.5639007Z Running command git checkout -q d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:32.8527985Z Resolved https://github.com/pytorch/vision.git to commit d23a6e1664d20707c11781299611436e1f0c104f 2024-08-06T20:53:35.3532269Z Preparing metadata (setup.py) ... [?25l- \ done 2024-08-06T20:53:35.3578340Z [?25hRequirement already satisfied: numpy in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.19.0a0+d23a6e1) (1.21.2) 2024-08-06T20:53:35.3580408Z Requirement already satisfied: torch in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.19.0a0+d23a6e1) (2.5.0a0+gitb9d86fa) 2024-08-06T20:53:35.3586144Z Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.19.0a0+d23a6e1) (10.3.0) 2024-08-06T20:53:35.3808486Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (3.13.1) 2024-08-06T20:53:35.3813615Z Requirement already satisfied: typing-extensions>=4.8.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (4.12.2) 2024-08-06T20:53:35.3816688Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (2.8.8) 2024-08-06T20:53:35.3819805Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (3.1.4) 2024-08-06T20:53:35.3822500Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (2024.6.1) 2024-08-06T20:53:35.3833025Z Requirement already satisfied: sympy>=1.13.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (1.13.1) 2024-08-06T20:53:35.3864660Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.0->torch->torchvision==0.19.0a0+d23a6e1) (1.3.0) 2024-08-06T20:53:35.4753059Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch->torchvision==0.19.0a0+d23a6e1) (2.1.5) 2024-08-06T20:53:35.4965781Z Building wheels for collected packages: torchvision 2024-08-06T20:54:53.9351773Z Building wheel for torchvision (setup.py) ... [?25l- \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / done 2024-08-06T20:54:53.9386316Z [?25h Created wheel for torchvision: filename=torchvision-0.19.0a0+d23a6e1-cp310-cp310-linux_x86_64.whl size=2115777 sha256=ef1bca23eb44688ac1dd1aea5ccb1a482b5c400275a3f28388009fa1a447770b 2024-08-06T20:54:53.9387518Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/0e/56/35/02931e71eb23fd2b85591c7ec05b733ca7c8b328a2fd151f96 2024-08-06T20:54:53.9424806Z Successfully built torchvision 2024-08-06T20:54:54.6418853Z Installing collected packages: torchvision 2024-08-06T20:54:55.0597846Z Successfully installed torchvision-0.19.0a0+d23a6e1 2024-08-06T20:54:55.1974916Z + '[' -n '' ']' 2024-08-06T20:54:55.1975181Z + test_python_shard 2 2024-08-06T20:54:55.1975417Z + [[ -z 5 ]] 2024-08-06T20:54:55.1975880Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --shard 2 5 --verbose 2024-08-06T20:54:55.2943831Z /var/lib/jenkins/workspace/test/run_test.py:21: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-08-06T20:54:55.2945346Z import pkg_resources 2024-08-06T20:54:58.8195535Z Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2024-08-06T20:54:58.8895354Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2024-08-06T20:54:58.9258420Z Ignoring disabled issues: [''] 2024-08-06T20:54:58.9367256Z Found test times from artifacts 2024-08-06T20:54:58.9785012Z Found test times from artifacts 2024-08-06T20:54:58.9797811Z Running 25% of tests based on TD 2024-08-06T20:54:59.0103497Z Running parallel tests on 3 processes 2024-08-06T20:54:59.0106241Z Name: tests to run (est. time: 67.61min) 2024-08-06T20:54:59.0106575Z Serial tests (0): 2024-08-06T20:54:59.0106799Z Parallel tests (28): 2024-08-06T20:54:59.0107031Z test_nestedtensor 1/1 2024-08-06T20:54:59.0107275Z test_decomp 4/22 2024-08-06T20:54:59.0107494Z test_decomp 6/22 2024-08-06T20:54:59.0107703Z test_decomp 7/22 2024-08-06T20:54:59.0107920Z test_decomp 17/22 2024-08-06T20:54:59.0108132Z test_decomp 21/22 2024-08-06T20:54:59.0108374Z test_decomp 22/22 2024-08-06T20:54:59.0108642Z inductor/test_torchinductor_opinfo 8/16 2024-08-06T20:54:59.0108967Z functorch/test_ops 3/7 2024-08-06T20:54:59.0109235Z functorch/test_ops 4/7 2024-08-06T20:54:59.0109728Z functorch/test_ops 7/7 2024-08-06T20:54:59.0110025Z test_ops 2/8 2024-08-06T20:54:59.0110303Z test_ops 4/8 2024-08-06T20:54:59.0110506Z test_ops 6/8 2024-08-06T20:54:59.0110721Z inductor/test_aot_inductor 7/16 2024-08-06T20:54:59.0111002Z inductor/test_aot_inductor 9/16 2024-08-06T20:54:59.0111286Z inductor/test_aot_inductor 11/16 2024-08-06T20:54:59.0111591Z inductor/test_torchinductor_dynamic_shapes 2/6 2024-08-06T20:54:59.0111944Z inductor/test_torchinductor_dynamic_shapes 3/6 2024-08-06T20:54:59.0112289Z inductor/test_torchinductor_dynamic_shapes 4/6 2024-08-06T20:54:59.0112610Z inductor/test_cuda_cpp_wrapper 4/5 2024-08-06T20:54:59.0112909Z inductor/test_inductor_freezing 1/1 2024-08-06T20:54:59.0113201Z inductor/test_fx_fusion 1/1 2024-08-06T20:54:59.0113451Z torch_np/test_dtype 1/1 2024-08-06T20:54:59.0113689Z test_deploy 1/1 2024-08-06T20:54:59.0113914Z profiler/test_cpp_thread 1/1 2024-08-06T20:54:59.0114177Z dynamo/test_exceptions 1/1 2024-08-06T20:54:59.0114443Z test_fx_experimental 1/1 2024-08-06T20:54:59.0114726Z Name: excluded (est. time: 30.43min) 2024-08-06T20:54:59.0114980Z Serial tests (4): 2024-08-06T20:54:59.0115210Z inductor/test_flex_attention 2/2 2024-08-06T20:54:59.0115495Z test_multiprocessing_spawn 1/1 2024-08-06T20:54:59.0115757Z test_fx 1/1 2024-08-06T20:54:59.0115970Z test_tensorexpr 1/1 2024-08-06T20:54:59.0116201Z Parallel tests (54): 2024-08-06T20:54:59.0116427Z test_quantization 6/6 2024-08-06T20:54:59.0116667Z test_schema_check 1/1 2024-08-06T20:54:59.0116918Z inductor/test_fused_attention 1/1 2024-08-06T20:54:59.0117211Z inductor/test_efficient_conv_bn_eval 1/1 2024-08-06T20:54:59.0117510Z functorch/test_control_flow 1/1 2024-08-06T20:54:59.0117797Z inductor/test_kernel_benchmark 1/1 2024-08-06T20:54:59.0118071Z test_binary_ufuncs 1/1 2024-08-06T20:54:59.0118318Z inductor/test_codecache 1/1 2024-08-06T20:54:59.0118638Z inductor/test_cudagraph_trees_expandable_segments 1/1 2024-08-06T20:54:59.0118995Z inductor/test_group_batch_fusion 1/1 2024-08-06T20:54:59.0119287Z inductor/test_cuda_repro 1/1 2024-08-06T20:54:59.0119551Z dynamo/test_repros 1/1 2024-08-06T20:54:59.0119780Z test_typing 1/1 2024-08-06T20:54:59.0119998Z dynamo/test_hooks 1/1 2024-08-06T20:54:59.0120241Z test_scatter_gather_ops 1/1 2024-08-06T20:54:59.0120501Z export/test_converter 1/1 2024-08-06T20:54:59.0120760Z test_public_bindings 1/1 2024-08-06T20:54:59.0121014Z dynamo/test_autograd_function 1/1 2024-08-06T20:54:59.0121295Z nn/test_multihead_attention 1/1 2024-08-06T20:54:59.0121564Z test_dynamic_shapes 1/1 2024-08-06T20:54:59.0121828Z torch_np/numpy_tests/linalg/test_linalg 1/1 2024-08-06T20:54:59.0122139Z profiler/test_memory_profiler 1/1 2024-08-06T20:54:59.0122415Z test_import_stats 1/1 2024-08-06T20:54:59.0122672Z torch_np/numpy_tests/core/test_numeric 1/1 2024-08-06T20:54:59.0122974Z profiler/test_torch_tidy 1/1 2024-08-06T20:54:59.0123242Z nn/test_init 1/1 2024-08-06T20:54:59.0123503Z test_segment_reductions 1/1 2024-08-06T20:54:59.0123770Z torch_np/test_reductions 1/1 2024-08-06T20:54:59.0124042Z profiler/test_execution_trace 1/1 2024-08-06T20:54:59.0124309Z test_indexing 1/1 2024-08-06T20:54:59.0124647Z export/test_experimental 1/1 2024-08-06T20:54:59.0124933Z benchmark_utils/test_benchmark_utils 1/1 2024-08-06T20:54:59.0125215Z test_out_dtype_op 1/1 2024-08-06T20:54:59.0125455Z nn/test_load_state_dict 1/1 2024-08-06T20:54:59.0125700Z test_fx_passes 1/1 2024-08-06T20:54:59.0125930Z dynamo/test_python_autograd 1/1 2024-08-06T20:54:59.0126200Z dynamo/test_optimizers 1/1 2024-08-06T20:54:59.0126486Z torch_np/numpy_tests/core/test_shape_base 1/1 2024-08-06T20:54:59.0126781Z test_prims 1/1 2024-08-06T20:54:59.0126998Z export/test_lift_unlift 1/1 2024-08-06T20:54:59.0127242Z dynamo/test_modes 1/1 2024-08-06T20:54:59.0127481Z test_cuda_sanitizer 1/1 2024-08-06T20:54:59.0127822Z profiler/test_profiler_tree 1/1 2024-08-06T20:54:59.0128095Z torch_np/test_scalars_0D_arrays 1/1 2024-08-06T20:54:59.0128413Z torch_np/numpy_tests/core/test_numerictypes 1/1 2024-08-06T20:54:59.0128754Z torch_np/numpy_tests/lib/test_arraypad 1/1 2024-08-06T20:54:59.0129053Z test_numba_integration 1/1 2024-08-06T20:54:59.0129303Z test_complex 1/1 2024-08-06T20:54:59.0129537Z torch_np/test_nep50_examples 1/1 2024-08-06T20:54:59.0129796Z nn/test_pruning 1/1 2024-08-06T20:54:59.0130029Z test_jit_llga_fuser 1/1 2024-08-06T20:54:59.0130282Z inductor/test_xpu_basic 1/1 2024-08-06T20:54:59.0130520Z xpu/test_gemm 1/1 2024-08-06T20:54:59.0130754Z optim/test_optim 1/1 2024-08-06T20:54:59.0131098Z Starting test batch 'tests to run' 0.0 seconds after initiating testing 2024-08-06T20:54:59.0174783Z Running test_nestedtensor 1/1 ... [2024-08-06 20:54:59.017039] 2024-08-06T20:54:59.0178624Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:54:59.017385] 2024-08-06T20:55:03.7418024Z 2024-08-06T20:55:03.7418980Z test_nestedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.1_c7c818116737d768_.log 2024-08-06T20:55:03.7419737Z Running 0 items in this shard: 2024-08-06T20:55:03.7419914Z 2024-08-06T20:55:03.7423498Z Running test_decomp 4/22 ... [2024-08-06 20:55:03.742078] 2024-08-06T20:55:03.7427411Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=4', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:03.742421] 2024-08-06T20:55:09.8678764Z 2024-08-06T20:55:09.8680690Z test_decomp 4/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_4.22_1c993cdc6879f82f_.log 2024-08-06T20:55:09.8681643Z Running 0 items in this shard: 2024-08-06T20:55:09.8681830Z 2024-08-06T20:55:09.8683185Z Running test_decomp 6/22 ... [2024-08-06 20:55:09.867907] 2024-08-06T20:55:09.8690089Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=6', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:09.868405] 2024-08-06T20:55:15.9943810Z 2024-08-06T20:55:15.9944532Z test_decomp 6/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.22_723b1d41e7b06743_.log 2024-08-06T20:55:15.9945145Z Running 0 items in this shard: 2024-08-06T20:55:15.9945317Z 2024-08-06T20:55:15.9945842Z Running test_decomp 7/22 ... [2024-08-06 20:55:15.994302] 2024-08-06T20:55:15.9950013Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=7', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:15.994635] 2024-08-06T20:55:22.0713149Z 2024-08-06T20:55:22.0714632Z test_decomp 7/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_7.22_0225f82664358233_.log 2024-08-06T20:55:22.0715498Z Running 0 items in this shard: 2024-08-06T20:55:22.0715673Z 2024-08-06T20:55:22.0716193Z Running test_decomp 17/22 ... [2024-08-06 20:55:22.071296] 2024-08-06T20:55:22.0719855Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=17', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:22.071631] 2024-08-06T20:55:28.1975479Z 2024-08-06T20:55:28.1976407Z test_decomp 17/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_17.22_04aaac7bcf595a12_.log 2024-08-06T20:55:28.1977452Z Running 0 items in this shard: 2024-08-06T20:55:28.1977627Z 2024-08-06T20:55:28.1978226Z Running test_decomp 21/22 ... [2024-08-06 20:55:28.197512] 2024-08-06T20:55:28.1982379Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=21', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:28.197875] 2024-08-06T20:55:34.3239093Z 2024-08-06T20:55:34.3240711Z test_decomp 21/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_21.22_a8b38bfe130b23b0_.log 2024-08-06T20:55:34.3241931Z Running 0 items in this shard: 2024-08-06T20:55:34.3242212Z 2024-08-06T20:55:34.3242460Z Running test_decomp 22/22 ... [2024-08-06 20:55:34.323931] 2024-08-06T20:55:34.3248106Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=22', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:34.324393] 2024-08-06T20:55:40.4502341Z 2024-08-06T20:55:40.4503397Z test_decomp 22/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_22.22_5f62108bfb875684_.log 2024-08-06T20:55:40.4504179Z Running 0 items in this shard: 2024-08-06T20:55:40.4504359Z 2024-08-06T20:55:40.4504625Z Running inductor/test_torchinductor_opinfo 8/16 ... [2024-08-06 20:55:40.450199] 2024-08-06T20:55:40.4508955Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=8', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:40.450528] 2024-08-06T20:55:49.8826550Z 2024-08-06T20:55:49.8827478Z inductor/test_torchinductor_opinfo 8/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.16_bc0f63e78032244d_.log 2024-08-06T20:55:49.8828269Z Running 0 items in this shard: 2024-08-06T20:55:49.8828474Z 2024-08-06T20:55:49.8830198Z Running functorch/test_ops 3/7 ... [2024-08-06 20:55:49.882711] 2024-08-06T20:55:49.8834038Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'serial', '--shard-id=3', '--num-shards=7', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:49.883063] 2024-08-06T20:55:56.4105189Z 2024-08-06T20:55:56.4106320Z functorch/test_ops 3/7 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_3.7_3bf65b2a552bdb0d_.log 2024-08-06T20:55:56.4106987Z Running 0 items in this shard: 2024-08-06T20:55:56.4107202Z 2024-08-06T20:55:56.4107759Z Running functorch/test_ops 4/7 ... [2024-08-06 20:55:56.410473] 2024-08-06T20:55:56.4113365Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'serial', '--shard-id=4', '--num-shards=7', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:55:56.410816] 2024-08-06T20:56:02.9883091Z 2024-08-06T20:56:02.9883825Z functorch/test_ops 4/7 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_4.7_f1708022e7ee2246_.log 2024-08-06T20:56:02.9884491Z Running 0 items in this shard: 2024-08-06T20:56:02.9884663Z 2024-08-06T20:56:02.9886894Z Running functorch/test_ops 7/7 ... [2024-08-06 20:56:02.988334] 2024-08-06T20:56:02.9890846Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'serial', '--shard-id=7', '--num-shards=7', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:02.988700] 2024-08-06T20:56:09.5157081Z 2024-08-06T20:56:09.5157972Z functorch/test_ops 7/7 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_7.7_6e93807924724224_.log 2024-08-06T20:56:09.5158741Z Running 0 items in this shard: 2024-08-06T20:56:09.5158924Z 2024-08-06T20:56:09.5159066Z Running test_ops 2/8 ... [2024-08-06 20:56:09.515551] 2024-08-06T20:56:09.5162576Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=2', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:09.515896] 2024-08-06T20:56:21.1023749Z 2024-08-06T20:56:21.1024679Z test_ops 2/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_2.8_44a8717aa20bbc2b_.log 2024-08-06T20:56:21.1025264Z Running 0 items in this shard: 2024-08-06T20:56:21.1025465Z 2024-08-06T20:56:21.1027288Z Running test_ops 4/8 ... [2024-08-06 20:56:21.102433] 2024-08-06T20:56:21.1031289Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=4', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:21.102794] 2024-08-06T20:56:32.6879504Z 2024-08-06T20:56:32.6880684Z test_ops 4/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_4.8_f4354431713e9870_.log 2024-08-06T20:56:32.6881368Z Running 0 items in this shard: 2024-08-06T20:56:32.6881550Z 2024-08-06T20:56:32.6882570Z Running test_ops 6/8 ... [2024-08-06 20:56:32.687970] 2024-08-06T20:56:32.6886979Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=6', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:32.688341] 2024-08-06T20:56:44.2250746Z 2024-08-06T20:56:44.2251843Z test_ops 6/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_6.8_92b29bad7748e831_.log 2024-08-06T20:56:44.2252980Z Running 0 items in this shard: 2024-08-06T20:56:44.2253366Z 2024-08-06T20:56:44.2253794Z Running inductor/test_aot_inductor 7/16 ... [2024-08-06 20:56:44.225052] 2024-08-06T20:56:44.2257845Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=7', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:44.225415] 2024-08-06T20:56:51.0023749Z 2024-08-06T20:56:51.0025113Z inductor/test_aot_inductor 7/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_7.16_eb833b86d3a09c09_.log 2024-08-06T20:56:51.0025917Z Running 0 items in this shard: 2024-08-06T20:56:51.0026104Z 2024-08-06T20:56:51.0027149Z Running inductor/test_aot_inductor 9/16 ... [2024-08-06 20:56:51.002414] 2024-08-06T20:56:51.0031707Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=9', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:51.002784] 2024-08-06T20:56:57.8309122Z 2024-08-06T20:56:57.8310486Z inductor/test_aot_inductor 9/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_9.16_7c48c5cae5e1f9ce_.log 2024-08-06T20:56:57.8311244Z Running 0 items in this shard: 2024-08-06T20:56:57.8311490Z 2024-08-06T20:56:57.8312011Z Running inductor/test_aot_inductor 11/16 ... [2024-08-06 20:56:57.830825] 2024-08-06T20:56:57.8315925Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=11', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:56:57.831203] 2024-08-06T20:57:04.6629215Z 2024-08-06T20:57:04.6630526Z inductor/test_aot_inductor 11/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_11.16_baacaa68a282cd62_.log 2024-08-06T20:57:04.6631289Z Running 0 items in this shard: 2024-08-06T20:57:04.6631525Z 2024-08-06T20:57:04.6633070Z Running inductor/test_torchinductor_dynamic_shapes 2/6 ... [2024-08-06 20:57:04.662922] 2024-08-06T20:57:04.6636824Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'serial', '--shard-id=2', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:04.663281] 2024-08-06T20:57:10.9898150Z 2024-08-06T20:57:10.9899697Z inductor/test_torchinductor_dynamic_shapes 2/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_2.6_e796776bea6851ef_.log 2024-08-06T20:57:10.9901020Z Running 0 items in this shard: 2024-08-06T20:57:10.9901263Z 2024-08-06T20:57:10.9901696Z Running inductor/test_torchinductor_dynamic_shapes 3/6 ... [2024-08-06 20:57:10.989707] 2024-08-06T20:57:10.9905953Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'serial', '--shard-id=3', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:10.990140] 2024-08-06T20:57:26.1333677Z 2024-08-06T20:57:26.1335463Z inductor/test_torchinductor_dynamic_shapes 3/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_3.6_1cad0782905fa8f3_.log 2024-08-06T20:57:26.1337462Z Running 2 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_block_sizes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_block_sizes_dynamic_shapes_cuda 2024-08-06T20:57:26.1338498Z 2024-08-06T20:57:26.1338789Z Running inductor/test_torchinductor_dynamic_shapes 4/6 ... [2024-08-06 20:57:26.133587] 2024-08-06T20:57:26.1343197Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'serial', '--shard-id=4', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:26.133979] 2024-08-06T20:57:32.4214050Z 2024-08-06T20:57:32.4215514Z inductor/test_torchinductor_dynamic_shapes 4/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_4.6_944c9e056a284ff9_.log 2024-08-06T20:57:32.4216791Z Running 0 items in this shard: 2024-08-06T20:57:32.4217042Z 2024-08-06T20:57:32.4217344Z Running inductor/test_cuda_cpp_wrapper 4/5 ... [2024-08-06 20:57:32.421157] 2024-08-06T20:57:32.4219494Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_cpp_wrapper.py', '-m', 'serial', '--shard-id=4', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:32.421522] 2024-08-06T20:57:38.9497799Z 2024-08-06T20:57:38.9498853Z inductor/test_cuda_cpp_wrapper 4/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_cpp_wrapper_4.5_d155ac4deb1da02c_.log 2024-08-06T20:57:38.9499615Z Running 0 items in this shard: 2024-08-06T20:57:38.9499790Z 2024-08-06T20:57:38.9500009Z Running inductor/test_inductor_freezing 1/1 ... [2024-08-06 20:57:38.949538] 2024-08-06T20:57:38.9502329Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_freezing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:38.949901] 2024-08-06T20:57:45.0780913Z 2024-08-06T20:57:45.0782042Z inductor/test_inductor_freezing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_freezing_1.1_f328c1a4beea5017_.log 2024-08-06T20:57:45.0782826Z Running 0 items in this shard: 2024-08-06T20:57:45.0783011Z 2024-08-06T20:57:45.0784982Z Running inductor/test_fx_fusion 1/1 ... [2024-08-06 20:57:45.078144] 2024-08-06T20:57:45.0789419Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fx_fusion.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:45.078574] 2024-08-06T20:57:48.8009648Z 2024-08-06T20:57:48.8010596Z inductor/test_fx_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fx_fusion_1.1_658a5163b038c898_.log 2024-08-06T20:57:48.8011402Z Running 0 items in this shard: 2024-08-06T20:57:48.8011583Z 2024-08-06T20:57:48.8011786Z Running torch_np/test_dtype 1/1 ... [2024-08-06 20:57:48.800884] 2024-08-06T20:57:48.8016143Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_dtype.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:48.801223] 2024-08-06T20:57:51.6715805Z 2024-08-06T20:57:51.6716936Z torch_np/test_dtype 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_dtype_1.1_15859ede88a296a5_.log 2024-08-06T20:57:51.6717713Z Running 0 items in this shard: 2024-08-06T20:57:51.6717959Z 2024-08-06T20:57:51.6718554Z Running test_deploy 1/1 ... [2024-08-06 20:57:51.671558] 2024-08-06T20:57:51.6723449Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_deploy.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:51.671950] 2024-08-06T20:57:54.4918000Z 2024-08-06T20:57:54.4918906Z test_deploy 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_deploy_1.1_eb83ca26a22c584b_.log 2024-08-06T20:57:54.4919562Z Running 0 items in this shard: 2024-08-06T20:57:54.4919738Z 2024-08-06T20:57:54.4921535Z Running profiler/test_cpp_thread 1/1 ... [2024-08-06 20:57:54.491808] 2024-08-06T20:57:54.4925058Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_cpp_thread.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:54.492139] 2024-08-06T20:57:59.5664128Z 2024-08-06T20:57:59.5665310Z profiler/test_cpp_thread 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_cpp_thread_1.1_2e29db9a2540ef70_.log 2024-08-06T20:57:59.5666373Z Running 0 items in this shard: 2024-08-06T20:57:59.5666641Z 2024-08-06T20:57:59.5667200Z Running dynamo/test_exceptions 1/1 ... [2024-08-06 20:57:59.566403] 2024-08-06T20:57:59.5670854Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_exceptions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:57:59.566740] 2024-08-06T20:58:02.3369549Z 2024-08-06T20:58:02.3370407Z dynamo/test_exceptions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_exceptions_1.1_9ec46a182d72911c_.log 2024-08-06T20:58:02.3371640Z Running 0 items in this shard: 2024-08-06T20:58:02.3371840Z 2024-08-06T20:58:02.3372050Z Running test_fx_experimental 1/1 ... [2024-08-06 20:58:02.336876] 2024-08-06T20:58:02.3375195Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_experimental.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:58:02.337209] 2024-08-06T20:58:06.9096979Z 2024-08-06T20:58:06.9097989Z test_fx_experimental 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_experimental_1.1_c0ea3f9d3b2f79ce_.log 2024-08-06T20:58:06.9098801Z Running 0 items in this shard: 2024-08-06T20:58:06.9182715Z 2024-08-06T20:58:06.9183114Z Running test_nestedtensor 1/1 ... [2024-08-06 20:58:06.917905] 2024-08-06T20:58:06.9183791Z Running test_decomp 4/22 ... [2024-08-06 20:58:06.918001] 2024-08-06T20:58:06.9185324Z Running test_decomp 6/22 ... [2024-08-06 20:58:06.918246] 2024-08-06T20:58:06.9188528Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:58:06.918412] 2024-08-06T20:58:06.9190200Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=4', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:58:06.918461] 2024-08-06T20:58:06.9191712Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=6', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 20:58:06.918669] 2024-08-06T21:00:29.0682658Z 2024-08-06T21:00:29.0683709Z test_nestedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.1_dc5f1dd81e9c07b0_.log 2024-08-06T21:00:29.1323138Z Running 1416 items in this shard: test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_cat, test/test_nestedtensor.py::TestNestedTensor::test_copy_, test/test_nestedtensor.py::TestNestedTensor::test_default_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_dim, test/test_nestedtensor.py::TestNestedTensor::test_fill_, test/test_nestedtensor.py::TestNestedTensor::test_is_contiguous, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_ones_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_randn_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_zeros_like, test/test_nestedtensor.py::TestNestedTensor::test_nested_namespace, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor_matching_dim, test/test_nestedtensor.py::TestNestedTensor::test_numel, test/test_nestedtensor.py::TestNestedTensor::test_repr_string, test/test_nestedtensor.py::TestNestedTensor::test_size, test/test_nestedtensor.py::TestNestedTensor::test_size_dim, test/test_nestedtensor.py::TestNestedTensor::test_stride, test/test_nestedtensor.py::TestNestedTensor::test_to, test/test_nestedtensor.py::TestNestedTensor::test_to_padded_tensor_on_empty_tensor, test/test_nestedtensor.py::TestNestedTensor::test_unbind_0, test/test_nestedtensor.py::TestNestedTensor::test_unbind_1, test/test_nestedtensor.py::TestNestedTensor::test_unbind_3, test/test_nestedtensor.py::TestNestedTensor::test_unbind_4, test/test_nestedtensor.py::TestNestedTensor::test_unbind_dim, test/test_nestedtensor.py::TestNestedTensor::test_zero_, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_abs__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_abs_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_cos_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_gelu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_gelu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_logical_not_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_neg_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_relu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_relu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_sgn_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_silu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_silu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_sin_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_tanh__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_tanh_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_binary_ops_with_scalar_eq_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_binary_ops_with_scalar_ge_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_clone_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_device_checks_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_strided_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_384_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_384_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_8_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_8_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sum_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_then_from_padded_tensor_no_transform0213_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_abs_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_accumulate_grad_different_strides_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_as_nested_tensor_propagates_gradients_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_add_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_for_add_op_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_for_sub_op_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_sub_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_gelu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_indexing_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_edge_case_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1023_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1024_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_256_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_512_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_513_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_masked_fill_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_mask_and_to_padded_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_fused_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_generates_leaf_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_plus_transpose_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_squeeze_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_squeeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_to_padded_tensor_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_relu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_selu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_set_requires_grad_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_set_requires_grad_from_mask_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_split_with_sizes_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_to_buffer_series_ops_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_unbind_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_values_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_apply__cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_broadcasting_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_transposed_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_chunk_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_preserves_metadata_cache_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_min_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_propagated_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flatten_decomp_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_contiguous_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_same_size_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_with_pinned_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layout_under_torch_dispatch_mode_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_empty_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_randn_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_ones_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_zeros_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_5_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_narrow_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_activation_checkpoint_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_fx_trace_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_njt_cat_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_pin_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_profiler_sequence_nr_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_True_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_specialize_dynamic_shape_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_specialize_dynamic_shape_recompile_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_with_sizes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_squeeze_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_tensor_attributes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_threshold_backward_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_copy_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_transposed_inputs_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_0_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_1_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_equals_2_bad_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_last_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unsafe_view_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_view_ragged_idx_not_one_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_views_inherit_ragged_dim_cuda, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_floor_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igammac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isreal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_binary_return_by_ref_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_not_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_legendre_polynomial_p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_igammac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isreal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_return_by_ref_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_not_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_legendre_polynomial_p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_xlogy_cuda_float32 2024-08-06T21:00:29.1951098Z 2024-08-06T21:00:32.1557783Z Running test_decomp 7/22 ... [2024-08-06 21:00:32.155210] 2024-08-06T21:00:32.1559433Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=7', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:00:32.155573] 2024-08-06T21:06:58.6586925Z 2024-08-06T21:06:58.6588100Z test_decomp 6/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.22_7c53a7588da0f0bd_.log 2024-08-06T21:06:58.6710509Z Running 432 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive__native_batch_norm_legit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__softmax_backward_data_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igammac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_singular_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_number_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_masked_fill_cuda, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_dot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_expand_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_masked_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nan_to_num_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardswish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_huber_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_bool 2024-08-06T21:06:58.6827532Z 2024-08-06T21:07:01.0656633Z 2024-08-06T21:07:01.0657919Z test_decomp 4/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_4.22_0cab0670d711450b_.log 2024-08-06T21:07:01.0848724Z Running 407 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__native_batch_norm_legit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_log_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_bilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_prelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softplus_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick__upsample_bilinear2d_aa_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_addcmul_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_hardswish_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_inf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_uniform_cuda 2024-08-06T21:07:01.1035411Z 2024-08-06T21:07:01.8825767Z Running test_decomp 17/22 ... [2024-08-06 21:07:01.881921] 2024-08-06T21:07:01.8827441Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=17', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:07:01.882306] 2024-08-06T21:07:04.1997594Z Running test_decomp 21/22 ... [2024-08-06 21:07:04.199198] 2024-08-06T21:07:04.2000540Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=21', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:07:04.199626] 2024-08-06T21:12:06.3787939Z 2024-08-06T21:12:06.3789164Z test_decomp 7/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_7.22_2d1d0e835d58ce77_.log 2024-08-06T21:12:06.3883157Z Running 339 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_ex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_batch_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_qr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_nuttall_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_left_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_clamp_min_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_fro_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_t_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_polar_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_train_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_rrelu_with_noise_cuda 2024-08-06T21:12:06.3973619Z 2024-08-06T21:12:09.5025108Z Running test_decomp 22/22 ... [2024-08-06 21:12:09.502003] 2024-08-06T21:12:09.5027741Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=22', '--num-shards=22', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:12:09.502374] 2024-08-06T21:12:57.4762865Z 2024-08-06T21:12:57.4763735Z test_decomp 21/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_21.22_52c3d44543a298c3_.log 2024-08-06T21:12:57.4881866Z Running 422 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__softmax_backward_data_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bincount_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_complex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hypot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorinv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_batch_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_qr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_mm_reduce_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_native_layer_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_float64 2024-08-06T21:12:57.4994518Z 2024-08-06T21:13:00.6466555Z Running inductor/test_torchinductor_opinfo 8/16 ... [2024-08-06 21:13:00.646011] 2024-08-06T21:13:00.6467838Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=8', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:13:00.646384] 2024-08-06T21:14:00.3040520Z 2024-08-06T21:14:00.3041654Z test_decomp 17/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_17.22_7d7f397af1933552_.log 2024-08-06T21:14:00.3153779Z Running 392 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_inverse_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_complex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_householder_product_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vector_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_celu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_grid_sample_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_neg_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_blackman_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_cosine_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_real_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__batch_norm_with_update_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_vdot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_logsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_silu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int32, test/test_decomp.py::HasDecompTest::test_aten_core_operators 2024-08-06T21:14:00.3258793Z 2024-08-06T21:14:03.5055267Z Running functorch/test_ops 3/7 ... [2024-08-06 21:14:03.504934] 2024-08-06T21:14:03.5059254Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'not serial', '--shard-id=3', '--num-shards=7', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:14:03.505367] 2024-08-06T21:20:19.8752841Z 2024-08-06T21:20:19.8754093Z inductor/test_torchinductor_opinfo 8/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.16_48a5e37732f88614_.log 2024-08-06T21:20:19.8853398Z Running 217 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmatmul___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flatten_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_slogdet_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log1p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_std_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_glu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_normalize_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_silu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_nuc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_neg_3_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float32 2024-08-06T21:20:19.8944538Z 2024-08-06T21:20:23.1003890Z Running functorch/test_ops 4/7 ... [2024-08-06 21:20:23.099771] 2024-08-06T21:20:23.1005755Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'not serial', '--shard-id=4', '--num-shards=7', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:20:23.100186] 2024-08-06T21:20:28.5069092Z 2024-08-06T21:20:28.5070211Z test_decomp 22/22 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_22.22_018afb577caad2e6_.log 2024-08-06T21:20:28.5186029Z Running 418 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_householder_product_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_singular_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_linear_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polar_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_quantile_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_real_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_left_shift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_stack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_gelu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_silu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_number_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_int8, test/test_decomp.py::DecompOneOffTestsCUDA::test_sdpa_nn_functional_scaled_dot_product_attention_cuda_float32 2024-08-06T21:20:28.5297196Z 2024-08-06T21:20:31.7227506Z Running functorch/test_ops 7/7 ... [2024-08-06 21:20:31.722185] 2024-08-06T21:20:31.7229403Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'not serial', '--shard-id=7', '--num-shards=7', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:20:31.722563] 2024-08-06T21:21:41.7857691Z 2024-08-06T21:21:41.7858519Z functorch/test_ops 3/7 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_3.7_a7de25d0d05076bd_.log 2024-08-06T21:21:41.8345278Z Running 1396 items in this shard: test/functorch/test_ops.py::TestOperatorsCUDA::test_data_write_errors_under_transform_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_layer_norm_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_mse_loss_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_erfcx_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_T_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_diagonal_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_diagonal_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_hsplit_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_unbind_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_mT_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_reshape_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_squeeze_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unfold_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unsqueeze_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resize__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyMulAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SelectAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_T_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___getitem___functorch_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__segment_reduce_lengths_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_acos_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcmul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_decomposed_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_all_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_aminmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_arange_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_asin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atan_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bernoulli_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bfloat16_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bool_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cat_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ceil_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_char_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_max_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_column_stack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cummax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumsum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagonal_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fft2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ihfft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_irfft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_flatten_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_power_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_frexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_full_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hstack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_fill_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_put_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_reduce_amin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isposinf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_det_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_inv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_matrix_power_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_norm_subgradients_at_zero_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logdet_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logspace_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logsumexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_median_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_normalize_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_softmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_maximum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_minimum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_movedim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_layer_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_max_pool1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_no_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_embedding_bag_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_gaussian_nll_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_gelu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_linear_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_margin_ranking_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_pool1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool2d_grad_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_prelu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_soft_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softplus_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_unfold_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ones_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_positive_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_put_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randint_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randn_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_roll_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_3_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsqrt_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_sum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_select_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sgn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_bartlett_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_nuttall_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_with_dtype_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_y0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_t_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_xlog1py_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tan_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trapz_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unbind_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unique_consecutive_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unsafe_split_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_zeros_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_erfcx_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_ZeroGradientsGenVmapAutogradFunction_cuda_float32 2024-08-06T21:21:41.8801961Z 2024-08-06T21:21:44.9446154Z Running test_ops 2/8 ... [2024-08-06 21:21:44.944055] 2024-08-06T21:21:44.9448850Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=2', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:21:44.944446] 2024-08-06T21:30:36.4407519Z 2024-08-06T21:30:36.4409003Z functorch/test_ops 4/7 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_4.7_68743ccfbd995ca9_.log 2024-08-06T21:30:36.4893954Z Running 1426 items in this shard: test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_tensor_with_scalar_list_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_T_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_broadcast_to_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_flatten_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_reshape_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_conj_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_conj_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unfold_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_as_strided_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tensor_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_CubeGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ScaleGradGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___getitem___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__chunk_cat_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_allclose_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_argwhere_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bucketize_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cartesian_prod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cauchy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cdouble_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cholesky_inverse_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cholesky_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_chunk_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_combinations_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_complex_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_count_nonzero_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cov_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cross_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_trunc_rounding_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_double_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_einsum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eq_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_equal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geometric_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_half_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hsplit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isneginf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lstsq_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_tensorsolve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vander_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vecdot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linspace_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumprod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_logaddexp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_prod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_var_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_meshgrid_variadic_tensors_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_min_reduction_with_dim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_celu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_embedding_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_glu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_huber_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_instance_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_area_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_linear_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_pool3d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multi_head_attention_forward_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_negative_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pdist_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_smooth_l1_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softmin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softmin_with_dtype_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_threshold_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pow_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rad2deg_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_real_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_remainder_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resolve_neg_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scalar_tensor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_amin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_exponential_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sort_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_u_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_zeta_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_unbiased_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_t_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_topk_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trapezoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unique_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___getitem___functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_local_response_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_full_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyMulAutogradFunction_cuda_float32 2024-08-06T21:30:36.5386496Z 2024-08-06T21:30:39.7189284Z Running test_ops 4/8 ... [2024-08-06 21:30:39.718310] 2024-08-06T21:30:39.7190611Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=4', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:30:39.718690] 2024-08-06T21:32:02.1991731Z 2024-08-06T21:32:02.1992979Z test_ops 2/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_2.8_f0a7bc2de991ae35_.log 2024-08-06T21:32:02.3208966Z Running 4029 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nansum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nonzero_static_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ones_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsafe_chunk_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_H_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_eq_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_flip_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log1p_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_alpha_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_square_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__upsample_bilinear2d_aa_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_acos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diff_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_half_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_histc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cond_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvals_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_householder_product_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmedian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nansum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_with_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_trilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_logsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool1d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multi_head_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_negative_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_nuc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_neg_3_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_y1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_hermite_polynomial_h_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_polygamma_special_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_scaled_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_stft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch__scaled_mm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_errors_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_errors_arange_cuda, test/test_ops.py::TestCommonCUDA::test_errors_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_errors_eye_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_errors_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_errors_item_cuda, test/test_ops.py::TestCommonCUDA::test_errors_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_errors_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_gaussian_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_errors_trace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_cuda, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rpow___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_interleave_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___radd___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cartesian_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_int_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_2inputs_2outputs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ldexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mH_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___radd___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rsub___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zero__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_warning_H_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmod___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rxor___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__chunk_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svdvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__unsafe_masked_index_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_abs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bool_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cov_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagflat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_double_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_half_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_householder_product_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_inv_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_qr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_unpack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_min_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_reflect_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_replicate_negative_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_norm_inf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pca_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pinverse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_put_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resize__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resolve_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_decimals_0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_slice_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_square_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_out_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igammac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_gcd_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_where_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_complex_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rpow___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_linear_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_fro_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapz_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vstack_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_arange_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expand_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_where_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view___radd___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmatmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rmul___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcdiv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_contiguous_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flipud_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_index_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_istft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_permute_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_randn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_var_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_decomposed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_angle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_argwhere_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_corrcoef_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_full_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_gather_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ldexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_svdvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mT_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_neg_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_linear_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_positive_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_pow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_interleave_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_slice_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unflatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsqueeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zeros_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_char_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_half_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_equal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_and_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_xor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_bfloat16_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diag_embed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isnan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_istft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_qr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logdet_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logical_or_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_movedim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_fro_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ormqr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pca_lowrank_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_renorm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_repeat_interleave_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize_as__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sgn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_short_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sigmoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_mean_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapezoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trapz_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsqueeze_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_where_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmatmul___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rpow___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__batch_norm_with_update_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_short_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_digamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_floor_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erfinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_flatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_floor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isnan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_abs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_aminmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argwhere_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bucketize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cauchy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_permuted_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expm1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_geqrf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cholesky_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cholesky_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_det_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_householder_product_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_inv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_multi_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_tensorsolve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_argmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_argmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_matrix_exp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_maximum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_zeros_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_ctc_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_glu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardswish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_inf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reshape_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resize_as__cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_roll_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_neg_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rsqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_exponential_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sparse_sampled_addmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tensor_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_uniform_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zeros_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_allclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_byte_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_and_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_in_place_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_or_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lcm_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_permuted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_equal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_native_dropout_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags___radd___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rpow___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_int_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_polar_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_asinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_or_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cos_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diag_embed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_equal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isneginf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unflatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_acosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addbmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_allclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_aminmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_asin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_asinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atan2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_conj_physical_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_floor_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_double_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flip_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_heaviside_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_inner_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cond_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lstsq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_matrix_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_median_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_movedim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanmedian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_selu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nonzero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_inf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_outer_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rad2deg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randint_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_select_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_short_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_cosine_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sinc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_j1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trapz_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsqueeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_complex_cuda_float32 2024-08-06T21:32:02.4375225Z 2024-08-06T21:32:05.3207972Z Running test_ops 6/8 ... [2024-08-06 21:32:05.320180] 2024-08-06T21:32:05.3209124Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=6', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:32:05.320544] 2024-08-06T21:32:52.4280038Z 2024-08-06T21:32:52.4280823Z functorch/test_ops 7/7 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_7.7_30bacd547c101322_.log 2024-08-06T21:32:52.4793026Z Running 1425 items in this shard: test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_char_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_jiterator_unary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv2d_strided_padding_dilation_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_max_unpool1d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_conj_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_hsplit_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_split_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_split_list_args_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_mH_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_neg_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_squeeze_multiple_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_transpose_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_unsqueeze_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_norm_subgradients_at_zero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_erfcx_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_byte_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ihfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_slogdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_min_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpySortAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyTakeAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rmatmul___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rsub___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_put_accumulate_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_add_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_angle_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_partial_views_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_asinh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atan2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cdouble_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumulative_trapezoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diag_embed_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagonal_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diff_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_floor_rounding_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dstack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fftshift_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_irfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_floor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_full_like_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gt_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_select_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_4inputs_with_extra_args_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_ex_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cross_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_diagonal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_factor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_matrix_rank_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_svdvals_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vector_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log1p_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_xor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logspace_tensor_overload_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumsum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_softmin_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_matmul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_pool2d_with_indices_backward_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_reduction_no_dim_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mvlgamma_mvlgamma_p_1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mvlgamma_mvlgamma_p_5_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nansum_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_dropout_backward_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_empty_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_new_empty_strided_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_groups_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_padding_no_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cross_entropy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_ctc_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_grid_sample_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardshrink_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bicubic_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_layer_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_logsigmoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mse_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mse_loss_functorch_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_normalize_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_circular_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_unshuffle_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_rms_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_silu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softsign_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_triplet_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_normal_in_place_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_normal_number_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pca_lowrank_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polygamma_polygamma_n_0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_prod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ravel_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_repeat_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reshape_as_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resize_as__cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_prod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_searchsorted_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_select_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_cosine_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_kaiser_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signbit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sinh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_mm_reduce_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_ndtri_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_with_sizes_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sum_to_size_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_take_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_complex_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vstack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__batch_norm_with_update_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_adaptive_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_cond_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_max_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_rsub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_log_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_dist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_svd_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_to_sparse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_triangular_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_partial_views_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_erfinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_clamp_min_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_householder_product_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_binary_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_poisson_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tensor_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_4inputs_with_extra_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_kron_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_minimum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_positive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_bessel_j1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_shifted_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_xlog1py_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_solve_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_native_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_new_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_pca_lowrank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_polygamma_polygamma_n_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_rot90_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_gaussian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unsqueeze_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_ScaleGradGenVmapAutogradFunction_cuda_float32 2024-08-06T21:32:52.5263707Z 2024-08-06T21:32:55.5938627Z Running inductor/test_aot_inductor 7/16 ... [2024-08-06 21:32:55.593298] 2024-08-06T21:32:55.5940740Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=7', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:32:55.593691] 2024-08-06T21:40:11.9457419Z 2024-08-06T21:40:11.9458889Z test_ops 4/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_4.8_6e12e65be181d4cf_.log 2024-08-06T21:40:12.1066115Z Running 4023 items in this shard: test/test_ops.py::TestSelfKwarg::test_self_kwargs, test/test_ops.py::TestCommonCUDA::test_compare_cpu_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_pca_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_dsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_put_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nonzero_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_resolve_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_scalar_tensor_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__chunk_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_long_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_imag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_relu6_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_positive_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_real_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_offsets_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__unsafe_masked_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_abs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_angle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_baddbmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bincount_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_corrcoef_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagflat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_double_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_exp2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_exp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_eye_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flip_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ge_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_factor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_ldl_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_pinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_pinv_hermitian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_pinv_singular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log10_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_unpack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_median_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ne_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_neg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_embedding_bag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multi_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_one_hot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_scaled_dot_product_attention_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nonzero_static_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_positive_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_pow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randint_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resize_as__cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_decimals_0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_blackman_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_gaussian_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_y0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_v_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_svd_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch_ops_aten__safe_softmax_default_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_errors_cat_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diff_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gt_cuda, test/test_ops.py::TestCommonCUDA::test_errors_jiterator_binary_return_by_ref_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_errors_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_errors_polar_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_blackman_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_errors_triu_cuda, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rsub___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cov_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gather_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kron_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_embedding_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_constant_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_topk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zero__cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_H_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rdiv___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___ror___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cummax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_4inputs_with_extra_args_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_channel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pad_circular_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_3_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_reduce_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_k0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_scaled_modified_bessel_k1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_list_args_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_to_sparse_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___radd___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__chunk_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rand_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorsolve_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorsolve_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_gaussian_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_out_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ormqr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning___getitem___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___radd___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_block_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_conj_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_channel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_roll_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__upsample_bilinear2d_aa_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_angle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bernoulli_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_byte_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_corrcoef_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diff_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flip_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_float_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_frac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_binary_return_by_ref_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lstsq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log10_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mH_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_binary_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_ctc_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_instance_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_linear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_trilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool3d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polar_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_qr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_quantile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ravel_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sinc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_scaled_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_svd_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tril_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_mean_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_mean_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_eye_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_le_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_tril_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_shapes_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_alpha_dropout_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_number_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_istft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cov_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagflat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gradient_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_outer_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pca_lowrank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pinverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize_as__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zero__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_cuda_complex64, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_renorm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_baddbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isreal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nextafter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diag_embed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isreal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_permuted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_rfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_det_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_mish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_zeros_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view_T_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rpow___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcmul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_exp2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_exp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expand_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_expm1_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fliplr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_full_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rsub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_std_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sum_to_size_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_triu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unflatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_acosh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addcmul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_baddbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_block_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_broadcast_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cdouble_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cholesky_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cholesky_inverse_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_combinations_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_permuted_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_float_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_hstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_item_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_det_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lu_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_movedim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_nuc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_normal_in_place_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_outer_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resize__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resolve_neg_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rot90_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scalar_tensor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_square_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_std_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_t_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unbind_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unsafe_split_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rdiv___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmatmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_int_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addcmul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atanh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_istft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_item_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_masked_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_movedim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_mul_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_square_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_t_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_triu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_where_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addbmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_any_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_block_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_combinations_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_eq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_eye_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_hfft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fill_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_factor_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_triangular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vecdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_lu_unpack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nansum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_narrow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_reflect_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_silu_complex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_qr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_square_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_take_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tanh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensordot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tile_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tril_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_uniform_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rdiv___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmul___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_polar_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_clamp_max_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_column_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_equal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_erf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fmod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_frac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log1p_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_not_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_maximum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nan_to_num_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_roll_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rsqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_square_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__segment_reduce_offsets_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__upsample_bilinear2d_aa_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addcdiv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_any_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bernoulli_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bool_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cdouble_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cfloat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_inverse_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_contiguous_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagflat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diff_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erfc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erfinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_full_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_igammac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_int_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_det_singular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_qr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lu_unpack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_median_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_msort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_narrow_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_celu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_linear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softsign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_unfold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_repeat_interleave_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rot90_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_searchsorted_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_airy_ai_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_entr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_hermite_polynomial_h_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_square_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_to_sparse_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_triangular_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_triu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unique_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_xlogy_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_permuted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_igamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_istft_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bernoulli_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_bmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_corrcoef_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erfinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_log_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gather_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_int_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resize__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fliplr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gcd_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_geometric_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_istft_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_le_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mT_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_silu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_j1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int8, test/test_ops.py::TestTagsCUDA::test_tags___rdiv___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_T_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_acosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_asin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atan2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cauchy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_min_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_deg2rad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erfc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_gcd_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_ge_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isfinite_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_item_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_le_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log1p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_xor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_minimum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_narrow_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ne_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal__in_place_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_permute_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_randn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_renorm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_zeta_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tril_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_as_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atleast_2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_or_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_broadcast_shapes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_deg2rad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flipud_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_float_power_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_floor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geqrf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_hypot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lerp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eig_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_slogdet_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svdvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_or_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_long_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_softmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_var_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_glu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_relu6_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ormqr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_permute_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_renorm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_repeat_interleave_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resize_as__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_roll_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signbit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_airy_ai_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_with_sizes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_squeeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tensordot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_to_sparse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_topk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_transpose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tril_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_true_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_where_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_xlogy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_zeros_like_cuda_float32 2024-08-06T21:40:12.2208053Z 2024-08-06T21:40:15.1741841Z Running inductor/test_aot_inductor 9/16 ... [2024-08-06 21:40:15.173602] 2024-08-06T21:40:15.1744244Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=9', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:40:15.174023] 2024-08-06T21:41:09.3088609Z 2024-08-06T21:41:09.3089509Z test_ops 6/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_6.8_a9d9dc6c489cdbf0_.log 2024-08-06T21:41:09.4418692Z Running 4023 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_H_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rand_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_slice_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_byte_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_int_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_abs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addbmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_einsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_imag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_det_singular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_multi_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mT_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardswish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_linear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multilabel_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_constant_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_silu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polar_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_airy_ai_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_topk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapz_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unique_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsafe_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_errors___ror___cuda, test/test_ops.py::TestCommonCUDA::test_errors_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_errors_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_errors_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ge_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_errors_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_errors_neg_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_errors_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_take_cuda, test/test_ops.py::TestCommonCUDA::test_errors_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_multiple_devices_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_median_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_outer_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cummin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagflat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_put_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_softsign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nonzero_static_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize_as__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_slice_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_consecutive_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kthvalue_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestCommonCUDA::test_out_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_imag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_eq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_equal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_4inputs_with_extra_args_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eig_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_matmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_msort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_binary_cross_entropy_with_logits_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_circular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_zero__cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_det_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__chunk_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eye_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flip_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_hstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sigmoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tril_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_clone_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagflat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flip_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_int_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_det_singular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_narrow_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_fro_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_list_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rpow___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_double_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_asin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_conj_physical_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expm1_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isfinite_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_permute_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_all_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cartesian_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chalf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_char_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_inverse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_column_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cov_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagflat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diff_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_multi_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log10_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_circular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softsign_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_positive_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_triangular_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zero__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___radd___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rsub___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_int_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_all_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_allclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atan2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lerp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rot90_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rsub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_select_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_entr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_zeta_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__softmax_backward_data_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_baddbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cummax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gradient_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_histc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_inner_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kron_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cond_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_triangular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nan_to_num_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmedian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanquantile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nansum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_gelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_kl_div_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_normalize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_rms_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_silu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_upsample_nearest_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_put_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_blackman_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_take_along_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapz_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_mean_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rand___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags___rmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__batch_norm_with_update_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_block_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exponential_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eye_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_frac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isnan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_masked_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_pow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_roll_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sgn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_entr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtri_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sum_to_size_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_triu_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_unbind_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__softmax_backward_data_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cartesian_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagflat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_floor_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gradient_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isfinite_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_unary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_le_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_det_singular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vector_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lu_unpack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_dropout_backward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_number_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pca_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_real_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_reciprocal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resize__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hann_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tril_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unique_consecutive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_real_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_view_copy_cuda_float32 2024-08-06T21:41:09.5614911Z 2024-08-06T21:41:12.5350210Z Running inductor/test_aot_inductor 11/16 ... [2024-08-06 21:41:12.534354] 2024-08-06T21:41:12.5352201Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=11', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:41:12.534760] 2024-08-06T21:49:36.3548027Z 2024-08-06T21:49:36.3549660Z inductor/test_aot_inductor 11/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_11.16_e9a072ac5423a2ba_.log 2024-08-06T21:49:36.3603622Z Running 60 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_deconv_freezing_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_cat_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fft_c2c_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_2_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quantized_linear_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_assert_async_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_freezing_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_multi_device_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_default_cuda_device_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_profiler_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_folding_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dup_unbacked_sym_decl_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_cubin_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_with_outer_code_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_addmm_multiple_dynamic_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_aliased_buffer_reuse_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_bool_input_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_buffer_mutation_3_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fft_c2c_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fqn_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_non_default_cuda_device_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_seq_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_grid_with_backed_symbols_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_addmm_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_aliased_buffer_reuse_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_2_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_constant_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_duplicated_params_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_multi_device_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_scaled_dot_product_efficient_attention_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_extern_kernel_arg_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_sympy_expr_arg_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_cond_with_multiple_outputs_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_convolution_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_custom_op_add_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fft_c2c_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_normal_functional_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_output_path_2_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_return_view_constant_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_reuse_kernel_dynamic_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_run_with_grad_enabled_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_zero_grid_with_backed_symbols_non_abi_compatible_cuda 2024-08-06T21:49:36.3656171Z 2024-08-06T21:49:39.5389788Z Running inductor/test_torchinductor_dynamic_shapes 2/6 ... [2024-08-06 21:49:39.538483] 2024-08-06T21:49:39.5393230Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'not serial', '--shard-id=2', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:49:39.538891] 2024-08-06T21:57:59.9729609Z 2024-08-06T21:57:59.9730471Z PRINTING LOG FILE of inductor/test_aot_inductor 7/16 (test/test-reports/inductor.test_aot_inductor_7.16_e50ead3b82712b34_.log) 2024-08-06T21:57:59.9731748Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-e514922bde3007a9.xml 2024-08-06T21:57:59.9733019Z ============================= test session starts ============================== 2024-08-06T21:57:59.9733753Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T21:57:59.9734535Z cachedir: .pytest_cache 2024-08-06T21:57:59.9735304Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T21:57:59.9736086Z rootdir: /var/lib/jenkins/workspace 2024-08-06T21:57:59.9736965Z configfile: pytest.ini 2024-08-06T21:57:59.9737525Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T21:57:59.9738137Z collecting ... collected 912 items 2024-08-06T21:57:59.9738483Z stepcurrent: Cannot find last run test, not skipping 2024-08-06T21:57:59.9774171Z Running 58 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_original_fqn_and_dtype_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_smem_above_default_limit_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fx_gm_return_tuple_validation_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_output_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_tensor_input_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_view_outputs_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_simple_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_dynamic_smem_above_default_limit_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fake_tensor_device_validation_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_missing_cubin_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_size_weight_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_4_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_int_list_input_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_misc_1_max_autotune_True_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_model_modified_weights_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_output_path_1_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_quantized_linear_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_reuse_kernel_dynamic_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_fp8_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_zero_size_weight_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_2_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_int_list_input_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_nested_tensor_from_jagged_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_scatter_reduce_fallback_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_equal_to_1_float_arg_dynamic_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda 2024-08-06T21:57:59.9808550Z 2024-08-06T21:57:59.9809582Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.2944s] [ 1%] 2024-08-06T21:57:59.9811123Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.8797s] [ 3%] 2024-08-06T21:57:59.9812636Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.5314s] [ 5%] 2024-08-06T21:57:59.9814494Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.3888s] [ 6%] 2024-08-06T21:57:59.9815990Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.3196s] [ 8%] 2024-08-06T21:57:59.9817468Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 10%] 2024-08-06T21:57:59.9818943Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0806 21:33:41.403000 12584 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-06T21:57:59.9819939Z PASSED [9.8615s] [ 12%] 2024-08-06T21:57:59.9820867Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.1455s] [ 13%] 2024-08-06T21:57:59.9822319Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.4146s] [ 15%] 2024-08-06T21:57:59.9823829Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.1050s] [ 17%] 2024-08-06T21:57:59.9825283Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 18%] 2024-08-06T21:57:59.9826808Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 20%] 2024-08-06T21:57:59.9828304Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 22%] 2024-08-06T21:57:59.9830206Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.4885s] [ 24%] 2024-08-06T21:57:59.9832048Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 25%] 2024-08-06T21:57:59.9833894Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [11.4691s] [ 27%] 2024-08-06T21:57:59.9835718Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [8.3920s] [ 29%] 2024-08-06T21:57:59.9837593Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 31%] 2024-08-06T21:57:59.9839373Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 32%] 2024-08-06T21:57:59.9841295Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 34%] 2024-08-06T21:57:59.9843283Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 36%] 2024-08-06T21:57:59.9845230Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0035s] (requires CUDA) [ 37%] 2024-08-06T21:57:59.9847214Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0030s] (requires CUDA) [ 39%] 2024-08-06T21:57:59.9849227Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0034s] (requires CUDA) [ 41%] 2024-08-06T21:57:59.9851404Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 43%] 2024-08-06T21:57:59.9853827Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_original_fqn_and_dtype_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [15.6181s] [ 44%] 2024-08-06T21:57:59.9856634Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_smem_above_default_limit_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 46%] 2024-08-06T21:57:59.9859201Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fx_gm_return_tuple_validation_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [0.0118s] [ 48%] 2024-08-06T21:57:59.9861814Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_output_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-06T21:57:59.9864222Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_tensor_input_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py W0806 21:34:54.918000 12584 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:57:59.9865815Z W0806 21:34:54.930000 12584 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:57:59.9866500Z W0806 21:35:08.870000 12584 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:57:59.9867118Z W0806 21:35:08.882000 12584 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:57:59.9867642Z PASSED [29.5747s] [ 51%] 2024-08-06T21:57:59.9869161Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0029s] (requires CUDA) [ 53%] 2024-08-06T21:57:59.9871666Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0027s] (requires CUDA) [ 55%] 2024-08-06T21:57:59.9874174Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_view_outputs_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.1344s] [ 56%] 2024-08-06T21:57:59.9876630Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_simple_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 58%] 2024-08-06T21:57:59.9879035Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cuda W0806 21:35:31.643000 12584 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-06T21:57:59.9881102Z W0806 21:35:31.651000 12584 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-06T21:57:59.9882097Z PASSED [8.4128s] [ 60%] 2024-08-06T21:57:59.9883199Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_dynamic_smem_above_default_limit_abi_compatible_cuda SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 62%] 2024-08-06T21:57:59.9884854Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fake_tensor_device_validation_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [0.0387s] [ 63%] 2024-08-06T21:57:59.9886359Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_missing_cubin_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [30.8857s] [ 65%] 2024-08-06T21:57:59.9889486Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_size_weight_abi_compatible_cuda <- test/inductor/test_torchinductor.py /tmp/tmpfwi9a6fz/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/cx6q7sbtzlgxdrfr66iocrggnouhhxh4xkllq6zw4frla4vbmrpc.cpp: In member function ‘void torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle)’: 2024-08-06T21:57:59.9892984Z /tmp/tmpfwi9a6fz/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/cx6q7sbtzlgxdrfr66iocrggnouhhxh4xkllq6zw4frla4vbmrpc.cpp:603:10: warning: variable ‘L__self___net_0_weight’ set but not used [-Wunused-but-set-variable] 2024-08-06T21:57:59.9894374Z 603 | auto L__self___net_0_weight = constants_->at(0); 2024-08-06T21:57:59.9894727Z | ^~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9895287Z PASSED [7.9352s] [ 67%] 2024-08-06T21:57:59.9896388Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_4_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0033s] (requires CUDA) [ 68%] 2024-08-06T21:57:59.9897949Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_int_list_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5298s] [ 70%] 2024-08-06T21:57:59.9899373Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_misc_1_max_autotune_True_non_abi_compatible_cpu E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-06T21:57:59.9900434Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:57:59.9900897Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-06T21:57:59.9905053Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.so 2024-08-06T21:57:59.9909321Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:57:59.9909781Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-06T21:57:59.9911230Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:57:59.9913616Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:57:59.9915080Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:57:59.9915836Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-06T21:57:59.9917538Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:57:59.9919871Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:57:59.9921450Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:57:59.9922273Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9923800Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:57:59.9925795Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:57:59.9927684Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:57:59.9928923Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 208 | AOTI_TORCH_CHECK( 2024-08-06T21:57:59.9929543Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9930129Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:57:59.9931794Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:57:59.9933592Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:242:25: required from here 2024-08-06T21:57:59.9935531Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:57:59.9936833Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:57:59.9937639Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9938351Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:57:59.9940051Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:57:59.9941848Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:255:25: required from here 2024-08-06T21:57:59.9943769Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:57:59.9945060Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:57:59.9945908Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9946619Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:57:59.9947703Z E0806 21:36:36.282000 12584 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-06T21:57:59.9948829Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-06T21:57:59.9949442Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-06T21:57:59.9949941Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:57:59.9950396Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-06T21:57:59.9954684Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.so 2024-08-06T21:57:59.9959025Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:57:59.9959492Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-06T21:57:59.9960933Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:57:59.9963321Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:57:59.9964776Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:57:59.9965470Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-06T21:57:59.9966980Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:57:59.9969431Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:57:59.9970968Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:57:59.9971724Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9973236Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:57:59.9975507Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:57:59.9977460Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:57:59.9978626Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 208 | AOTI_TORCH_CHECK( 2024-08-06T21:57:59.9979244Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9979906Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:57:59.9981571Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:57:59.9983374Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:242:25: required from here 2024-08-06T21:57:59.9985066Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:57:59.9986358Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:57:59.9987165Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9987867Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:57:59.9989573Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:57:59.9991357Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:255:25: required from here 2024-08-06T21:57:59.9993645Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/aq/caqouen3lmvyllrahj4gkvehm6ygngmm3iv2uaff3lpaqtcbi3ab.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:57:59.9994955Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:57:59.9995976Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:57:59.9996687Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:57:59.9997243Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] . 2024-08-06T21:57:59.9997753Z E0806 21:36:37.967000 12584 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-06T21:57:59.9998356Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-06T21:57:59.9998888Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:57:59.9999422Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-06T21:58:00.0003570Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.so 2024-08-06T21:58:00.0007975Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0008437Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-06T21:58:00.0009922Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0012315Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0013781Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0014677Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0016189Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0018512Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0020051Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0020811Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0022497Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0024492Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0026391Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0027622Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0028230Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0028828Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0030486Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0032351Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:198:25: required from here 2024-08-06T21:58:00.0034046Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0035351Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0036204Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0036918Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0038607Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0040401Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:211:25: required from here 2024-08-06T21:58:00.0042099Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0043399Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0044201Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0044909Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0046002Z E0806 21:36:39.578000 12584 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-06T21:58:00.0047275Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-06T21:58:00.0047872Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-06T21:58:00.0048364Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0048821Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-06T21:58:00.0052955Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.so 2024-08-06T21:58:00.0057587Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0058044Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-06T21:58:00.0059506Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0061882Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0063338Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0064027Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0065534Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0067918Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0069451Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0070214Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0071714Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0073886Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0075785Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0076959Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0077578Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0078160Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0079892Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0081682Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:198:25: required from here 2024-08-06T21:58:00.0083375Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0084733Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0085554Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0086309Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0088011Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0089787Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:211:25: required from here 2024-08-06T21:58:00.0091484Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/bi/cbiytpk2ihr5tmxq5wmd6dsqrlbwxqsnfazpx7e5tvlvrrbxncok.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0093376Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0094190Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0094993Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0095551Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] . 2024-08-06T21:58:00.0096121Z E0806 21:36:41.232000 12584 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-06T21:58:00.0096719Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-06T21:58:00.0097250Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0097708Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-06T21:58:00.0102186Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.so 2024-08-06T21:58:00.0106593Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0107058Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-06T21:58:00.0108573Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0111057Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0112509Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0113212Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0114702Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0117067Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0118607Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0119375Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0120879Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0122854Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0124738Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0125956Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0126755Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0127340Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0128993Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0130782Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:198:25: required from here 2024-08-06T21:58:00.0132544Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0133833Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0134872Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0135632Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0137491Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0139280Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:211:25: required from here 2024-08-06T21:58:00.0140950Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0142238Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0143044Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0143755Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0144842Z E0806 21:36:42.856000 12584 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-06T21:58:00.0145917Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-06T21:58:00.0146519Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-06T21:58:00.0147015Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0147467Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-06T21:58:00.0151738Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.so 2024-08-06T21:58:00.0156004Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0156542Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-06T21:58:00.0157981Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0160338Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0161788Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0162680Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0164197Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0166553Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0168072Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0168839Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0170343Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0187722Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0190003Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0191195Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0191817Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0193183Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0195359Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0197155Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:198:25: required from here 2024-08-06T21:58:00.0198858Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0200257Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0201063Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0201774Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0203480Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0205350Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:211:25: required from here 2024-08-06T21:58:00.0207053Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] /tmp/tmp050h3jje/vi/cvicnm2z7trphjjovw23vh2w2tyt5lhf2vujihovxbrh46plmt6e.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0208333Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0209135Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0209844Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0210405Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] . 2024-08-06T21:58:00.0210912Z E0806 21:36:44.517000 12584 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-06T21:58:00.0211415Z PASSED [25.9813s] [ 72%] 2024-08-06T21:58:00.0212463Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_model_modified_weights_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.9718s] [ 74%] 2024-08-06T21:58:00.0213980Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_output_path_1_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5883s] [ 75%] 2024-08-06T21:58:00.0215850Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_quantized_linear_non_abi_compatible_cpu [W806 21:37:32.563906609 QuantizedLinear.cpp:383] Warning: fbgemm_pack_gemm_matrix_fp16 is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-06T21:58:00.0217412Z [W806 21:37:46.440345365 QuantizedLinear.cpp:418] Warning: fbgemm_linear_fp16_weight_fp32_activation is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-06T21:58:00.0218220Z PASSED [14.9233s] [ 77%] 2024-08-06T21:58:00.0219365Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_reuse_kernel_dynamic_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [17.2506s] [ 79%] 2024-08-06T21:58:00.0220857Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_fp8_non_abi_compatible_cpu SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 81%] 2024-08-06T21:58:00.0222461Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0033s] (requires CUDA) [ 82%] 2024-08-06T21:58:00.0224164Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu SKIPPED [0.0030s] (requires CUDA) [ 84%] 2024-08-06T21:58:00.0225846Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_zero_size_weight_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5938s] [ 86%] 2024-08-06T21:58:00.0227362Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_2_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.5103s] [ 87%] 2024-08-06T21:58:00.0228871Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_int_list_input_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.2229s] [ 89%] 2024-08-06T21:58:00.0230414Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_nested_tensor_from_jagged_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.7658s] [ 91%] 2024-08-06T21:58:00.0232057Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_scatter_reduce_fallback_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.3363s] [ 93%] 2024-08-06T21:58:00.0233557Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_equal_to_1_float_arg_dynamic_True_non_abi_compatible_cuda PASSED [16.5652s] [ 94%] 2024-08-06T21:58:00.0235048Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda PASSED [21.5675s] [ 96%] 2024-08-06T21:58:00.0236595Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_non_abi_compatible_cuda PASSED [16.5588s] [ 98%] 2024-08-06T21:58:00.0238002Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda PASSED [0.2943s] [100%] 2024-08-06T21:58:00.0238638Z 2024-08-06T21:58:00.0239209Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-e514922bde3007a9.xml - 2024-08-06T21:58:00.0240244Z ================== 35 passed, 23 skipped in 438.75s (0:07:18) ================== 2024-08-06T21:58:00.0240694Z Got exit code -11 (SIGSEGV) 2024-08-06T21:58:00.0240954Z Retrying single test... 2024-08-06T21:58:00.0241523Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-a62140c89544360f.xml 2024-08-06T21:58:00.0242268Z ============================= test session starts ============================== 2024-08-06T21:58:00.0242799Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T21:58:00.0243250Z cachedir: .pytest_cache 2024-08-06T21:58:00.0243795Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T21:58:00.0244381Z rootdir: /var/lib/jenkins/workspace 2024-08-06T21:58:00.0244656Z configfile: pytest.ini 2024-08-06T21:58:00.0245227Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T21:58:00.0245796Z collecting ... collected 912 items 2024-08-06T21:58:00.0246139Z stepcurrent: Cannot find last run test, not skipping 2024-08-06T21:58:00.0246483Z Running 58 items in this shard 2024-08-06T21:58:00.0246652Z 2024-08-06T21:58:00.0248909Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.9836s] [ 1%] 2024-08-06T21:58:00.0250432Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.0055s] [ 3%] 2024-08-06T21:58:00.0251985Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.3979s] [ 5%] 2024-08-06T21:58:00.0253453Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4485s] [ 6%] 2024-08-06T21:58:00.0255076Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.2149s] [ 8%] 2024-08-06T21:58:00.0256593Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0031s] (requires CUDA) [ 10%] 2024-08-06T21:58:00.0258127Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0806 21:46:07.725000 16396 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-06T21:58:00.0259113Z PASSED [9.9212s] [ 12%] 2024-08-06T21:58:00.0260033Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.2257s] [ 13%] 2024-08-06T21:58:00.0261477Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.4237s] [ 15%] 2024-08-06T21:58:00.0262923Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.2466s] [ 17%] 2024-08-06T21:58:00.0264388Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 18%] 2024-08-06T21:58:00.0265967Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 20%] 2024-08-06T21:58:00.0267469Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 22%] 2024-08-06T21:58:00.0269067Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.1741s] [ 24%] 2024-08-06T21:58:00.0270892Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 25%] 2024-08-06T21:58:00.0272752Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [11.1191s] [ 27%] 2024-08-06T21:58:00.0274712Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [8.2871s] [ 29%] 2024-08-06T21:58:00.0276509Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 31%] 2024-08-06T21:58:00.0278285Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 32%] 2024-08-06T21:58:00.0280304Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 34%] 2024-08-06T21:58:00.0282230Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 36%] 2024-08-06T21:58:00.0284167Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0034s] (requires CUDA) [ 37%] 2024-08-06T21:58:00.0286255Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0030s] (requires CUDA) [ 39%] 2024-08-06T21:58:00.0288281Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0037s] (requires CUDA) [ 41%] 2024-08-06T21:58:00.0290449Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 43%] 2024-08-06T21:58:00.0293444Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_original_fqn_and_dtype_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [15.5489s] [ 44%] 2024-08-06T21:58:00.0296255Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_smem_above_default_limit_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 46%] 2024-08-06T21:58:00.0298816Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fx_gm_return_tuple_validation_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [0.0135s] [ 48%] 2024-08-06T21:58:00.0301271Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_output_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-06T21:58:00.0303986Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_tensor_input_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py W0806 21:47:20.759000 16396 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:58:00.0305584Z W0806 21:47:20.773000 16396 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:58:00.0306215Z W0806 21:47:35.166000 16396 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:58:00.0306841Z W0806 21:47:35.178000 16396 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T21:58:00.0307365Z PASSED [29.7430s] [ 51%] 2024-08-06T21:58:00.0308876Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0028s] (requires CUDA) [ 53%] 2024-08-06T21:58:00.0311492Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0026s] (requires CUDA) [ 55%] 2024-08-06T21:58:00.0313914Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_view_outputs_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.2786s] [ 56%] 2024-08-06T21:58:00.0316404Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_simple_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 58%] 2024-08-06T21:58:00.0318825Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cuda W0806 21:47:57.797000 16396 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-06T21:58:00.0320892Z W0806 21:47:57.806000 16396 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-06T21:58:00.0321882Z PASSED [8.4789s] [ 60%] 2024-08-06T21:58:00.0322998Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_dynamic_smem_above_default_limit_abi_compatible_cuda SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 62%] 2024-08-06T21:58:00.0324662Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fake_tensor_device_validation_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [0.0349s] [ 63%] 2024-08-06T21:58:00.0326165Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_missing_cubin_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [29.1281s] [ 65%] 2024-08-06T21:58:00.0328892Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_size_weight_abi_compatible_cuda <- test/inductor/test_torchinductor.py /tmp/tmpum_r4sbw/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/c2lvv4oihvl6z5sv6azns2pcwgzqfoa2e7wrnwqnpdgnkn73ae4h.cpp: In member function ‘void torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle)’: 2024-08-06T21:58:00.0331726Z /tmp/tmpum_r4sbw/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/c2lvv4oihvl6z5sv6azns2pcwgzqfoa2e7wrnwqnpdgnkn73ae4h.cpp:603:10: warning: variable ‘L__self___net_0_weight’ set but not used [-Wunused-but-set-variable] 2024-08-06T21:58:00.0332959Z 603 | auto L__self___net_0_weight = constants_->at(0); 2024-08-06T21:58:00.0333293Z | ^~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0333624Z PASSED [8.0364s] [ 67%] 2024-08-06T21:58:00.0334845Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_4_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 68%] 2024-08-06T21:58:00.0336412Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_int_list_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.3083s] [ 70%] 2024-08-06T21:58:00.0337839Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_misc_1_max_autotune_True_non_abi_compatible_cpu E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-06T21:58:00.0338870Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0339332Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-06T21:58:00.0343483Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.so 2024-08-06T21:58:00.0347831Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0348301Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-06T21:58:00.0349744Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0352130Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0353610Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0354312Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0355816Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0358152Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0359815Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0360584Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0362093Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0364091Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0366045Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0367222Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 208 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0367834Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0368422Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0370074Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0371934Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:242:25: required from here 2024-08-06T21:58:00.0373637Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0375060Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0375865Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0376572Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0378292Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0380090Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:255:25: required from here 2024-08-06T21:58:00.0381787Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0383081Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0383890Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0384778Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0385868Z E0806 21:49:00.615000 16396 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-06T21:58:00.0386943Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-06T21:58:00.0387551Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-06T21:58:00.0388060Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0388530Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-06T21:58:00.0393321Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.so 2024-08-06T21:58:00.0397811Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0398275Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-06T21:58:00.0399768Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0402170Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0403628Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0404330Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0405850Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0408187Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0409731Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0410495Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0412233Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0414442Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0416341Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0417621Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 208 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0418223Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0418822Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0420487Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0422283Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:242:25: required from here 2024-08-06T21:58:00.0424030Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0425338Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0426142Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0426852Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0429833Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0431638Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:255:25: required from here 2024-08-06T21:58:00.0433346Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/ye/cye7jusntxjdxghha6tqp6exhehvia4isrhq675zg6gtoldvozsv.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0434646Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0435443Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0436159Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0436722Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] . 2024-08-06T21:58:00.0437239Z E0806 21:49:02.300000 16396 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-06T21:58:00.0437833Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-06T21:58:00.0438486Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0438948Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-06T21:58:00.0443080Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.so 2024-08-06T21:58:00.0447437Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0447892Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-06T21:58:00.0449378Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0451765Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0453233Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0453922Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0455611Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0457935Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0459470Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0460230Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0461719Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0463702Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0465769Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0466939Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0467541Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0468122Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0469782Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0471625Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:198:25: required from here 2024-08-06T21:58:00.0473297Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0474584Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0475439Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0476201Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0477906Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0479672Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:211:25: required from here 2024-08-06T21:58:00.0481345Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0482637Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0483444Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0484145Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0485225Z E0806 21:49:03.881000 16396 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-06T21:58:00.0486302Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-06T21:58:00.0486903Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-06T21:58:00.0487391Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0487846Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-06T21:58:00.0492674Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.so 2024-08-06T21:58:00.0497183Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0497649Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-06T21:58:00.0499086Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0501468Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0503007Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0503697Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0505191Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0507491Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0509026Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0509782Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0511282Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0513251Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0515134Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0516305Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0517032Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0517612Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0519264Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0521050Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:198:25: required from here 2024-08-06T21:58:00.0522784Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0524079Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0524873Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0525594Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0527371Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0529160Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:211:25: required from here 2024-08-06T21:58:00.0530838Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/k7/ck7awqg2ctbxl3happ73332psywyztqmboy6s2tc24ggzdmkz7w7.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0532121Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0532920Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0533628Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0534179Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] . 2024-08-06T21:58:00.0534816Z E0806 21:49:05.504000 16396 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-06T21:58:00.0535412Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-06T21:58:00.0535930Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0536383Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-06T21:58:00.0540615Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.so 2024-08-06T21:58:00.0544885Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 2024-08-06T21:58:00.0545347Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-06T21:58:00.0546755Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0549169Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0550623Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0551310Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0552859Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0555178Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0556762Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0557518Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0559017Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0561009Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0562885Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0564052Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0564658Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0565245Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0567046Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0568835Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:198:25: required from here 2024-08-06T21:58:00.0570516Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0571869Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0572714Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0573415Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0575240Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0577032Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:211:25: required from here 2024-08-06T21:58:00.0578841Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0580128Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0580938Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0581645Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-06T21:58:00.0582730Z E0806 21:49:07.092000 16396 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-06T21:58:00.0583804Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-06T21:58:00.0584407Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-06T21:58:00.0584900Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0585351Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-06T21:58:00.0589697Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.so 2024-08-06T21:58:00.0594444Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 2024-08-06T21:58:00.0594903Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-06T21:58:00.0596337Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0598725Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-06T21:58:00.0600372Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-06T21:58:00.0601071Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-06T21:58:00.0602571Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-06T21:58:00.0604884Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-06T21:58:00.0606537Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0607306Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0608804Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-06T21:58:00.0610785Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-06T21:58:00.0612681Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0613854Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-06T21:58:00.0614590Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0615170Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0616823Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-06T21:58:00.0618612Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:198:25: required from here 2024-08-06T21:58:00.0620447Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0621742Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0622539Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0623242Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0624937Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-06T21:58:00.0626775Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:211:25: required from here 2024-08-06T21:58:00.0628461Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] /tmp/tmphvpzxlav/rl/crlyjovlbaw3v52urwd64uy6bupo2nw3sxjxxrgcvmi5tugr5dnp.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-06T21:58:00.0629746Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-06T21:58:00.0630587Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-06T21:58:00.0631294Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-06T21:58:00.0631852Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] . 2024-08-06T21:58:00.0632354Z E0806 21:49:08.707000 16396 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-06T21:58:00.0632845Z PASSED [25.1891s] [ 72%] 2024-08-06T21:58:00.0633863Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_model_modified_weights_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.6976s] [ 74%] 2024-08-06T21:58:00.0635390Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_output_path_1_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.6703s] [ 75%] 2024-08-06T21:58:00.0637028Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_quantized_linear_non_abi_compatible_cpu [W806 21:49:55.926974732 QuantizedLinear.cpp:383] Warning: fbgemm_pack_gemm_matrix_fp16 is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-06T21:58:00.0638586Z [W806 21:50:10.666318221 QuantizedLinear.cpp:418] Warning: fbgemm_linear_fp16_weight_fp32_activation is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-06T21:58:00.0639383Z PASSED [14.7959s] [ 77%] 2024-08-06T21:58:00.0640365Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_reuse_kernel_dynamic_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.7222s] [ 79%] 2024-08-06T21:58:00.0641846Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_fp8_non_abi_compatible_cpu SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 81%] 2024-08-06T21:58:00.0643452Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0031s] (requires CUDA) [ 82%] 2024-08-06T21:58:00.0645274Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 84%] 2024-08-06T21:58:00.0646869Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_zero_size_weight_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5192s] [ 86%] 2024-08-06T21:58:00.0648384Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_2_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4970s] [ 87%] 2024-08-06T21:58:00.0649893Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_int_list_input_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.3033s] [ 89%] 2024-08-06T21:58:00.0651488Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_nested_tensor_from_jagged_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.7977s] [ 91%] 2024-08-06T21:58:00.0653066Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_scatter_reduce_fallback_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.9360s] [ 93%] 2024-08-06T21:58:00.0654708Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_equal_to_1_float_arg_dynamic_True_non_abi_compatible_cuda PASSED [16.3593s] [ 94%] 2024-08-06T21:58:00.0656242Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda PASSED [21.2965s] [ 96%] 2024-08-06T21:58:00.0657782Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_non_abi_compatible_cuda PASSED [16.5132s] [ 98%] 2024-08-06T21:58:00.0659182Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda PASSED [0.2983s] [100%] 2024-08-06T21:58:00.0659812Z 2024-08-06T21:58:00.0660378Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-a62140c89544360f.xml - 2024-08-06T21:58:00.0661376Z ================== 35 passed, 23 skipped in 433.88s (0:07:13) ================== 2024-08-06T21:58:00.0661810Z Got exit code -11 (SIGSEGV) 2024-08-06T21:58:00.0662060Z Retrying single test... 2024-08-06T21:58:00.0662635Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-cc0f66fcca588979.xml 2024-08-06T21:58:00.0663379Z ============================= test session starts ============================== 2024-08-06T21:58:00.0663905Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T21:58:00.0664361Z cachedir: .pytest_cache 2024-08-06T21:58:00.0664887Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T21:58:00.0665469Z rootdir: /var/lib/jenkins/workspace 2024-08-06T21:58:00.0665768Z configfile: pytest.ini 2024-08-06T21:58:00.0666235Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T21:58:00.0666904Z collecting ... collected 912 items / 57 deselected / 855 selected 2024-08-06T21:58:00.0667818Z stepcurrent: skipping 57 already run items. Running only test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda 2024-08-06T21:58:00.0668626Z Running 1 items in this shard 2024-08-06T21:58:00.0668792Z 2024-08-06T21:58:00.0669555Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda PASSED [1.4074s] [100%] 2024-08-06T21:58:00.0670191Z 2024-08-06T21:58:00.0670744Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-cc0f66fcca588979.xml - 2024-08-06T21:58:00.0671731Z ======================= 1 passed, 57 deselected in 1.49s ======================= 2024-08-06T21:58:00.0672147Z Got exit code 0 2024-08-06T21:58:00.0672470Z Test succeeeded in new process, continuing with the rest of the tests 2024-08-06T21:58:00.0673181Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-7ce886a009114a1b.xml 2024-08-06T21:58:00.0674009Z ============================= test session starts ============================== 2024-08-06T21:58:00.0674517Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T21:58:00.0674968Z cachedir: .pytest_cache 2024-08-06T21:58:00.0675505Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T21:58:00.0676077Z rootdir: /var/lib/jenkins/workspace 2024-08-06T21:58:00.0676352Z configfile: pytest.ini 2024-08-06T21:58:00.0676795Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T21:58:00.0677449Z collecting ... collected 912 items / 58 deselected / 854 selected 2024-08-06T21:58:00.0677902Z stepcurrent: skipping 58 already run items. 2024-08-06T21:58:00.0678205Z Running 0 items in this shard 2024-08-06T21:58:00.0678370Z 2024-08-06T21:58:00.0678927Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-7ce886a009114a1b.xml - 2024-08-06T21:58:00.0679879Z ============================ 58 deselected in 0.07s ============================ 2024-08-06T21:58:00.0680861Z The following tests failed and then succeeded when run in a new process['ul', 'test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda'] 2024-08-06T21:58:00.0681619Z 2024-08-06T21:58:00.0682044Z FINISHED PRINTING LOG FILE of inductor/test_aot_inductor 7/16 (test/test-reports/inductor.test_aot_inductor_7.16_e50ead3b82712b34_.log) 2024-08-06T21:58:00.0682551Z 2024-08-06T21:58:03.1272417Z Running inductor/test_torchinductor_dynamic_shapes 3/6 ... [2024-08-06 21:58:03.126671] 2024-08-06T21:58:03.1273927Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'not serial', '--shard-id=3', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 21:58:03.127048] 2024-08-06T22:02:49.5231427Z 2024-08-06T22:02:49.5232903Z inductor/test_torchinductor_dynamic_shapes 2/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_2.6_bb4b20634ee035ca_.log 2024-08-06T22:02:49.5339342Z Running 246 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_override_registration_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_with_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_batch_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_float_ndigits_neg_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_unbacked_legacy_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cauchy_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_clamp_type_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_complex_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_computed_buffer_inlining_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_concat_add_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_config_option_dont_assume_alignment_cudagraphs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_const_int32_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_convolution1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_convolution4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cos_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cumsum_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_by_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_prim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout_trivial_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_elu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_exp2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_expand_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_expand_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_flip_cat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_forced_buffer_realize_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fuse_large_params_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gather_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inductor_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inf_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inner_fn_str_and_stride_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_mixed_dtype_ops_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_where_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_isinf_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mean_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_min_max_reduction_nan_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_move_arange_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_to_num_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_narrow_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_neg_max_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_new_empty_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nll_loss_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nonzero_unbacked_refinement_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pixel_shuffle_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_y1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_chebyshev_polynomial_t_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_chebyshev_polynomial_w_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_entr_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_gammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_i0e_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_psi_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_scaled_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_shifted_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_prod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_randint_int64_mod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_randint_kernel_count_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reflection_pad2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reflection_pad2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_resize_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_roi_align_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_setitem_with_int_parameter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_single_elem_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_mutation3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_view_with_graph_break_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_softmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sort_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tan_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tensor_index_put_slice_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tmp_not_defined_issue2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_transpose_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_uint4x2_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unbacked_floordiv_simplify_errors_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unbind_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_var_correction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_as_real_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_where_with_logical_op_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_zero_element_mutation_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_abs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aliased_buffer_reuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_cache_hit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_as_strided_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_batch_norm_2d_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_computed_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_default_kwargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_builtins_round_float_ndigits_pos_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_config_option_dont_assume_alignment_cudagraphs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_config_option_dont_assume_alignment_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_fixed_layout_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_op_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div9_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_by_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_precision_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtype_mismatch_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_erfinv_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_expanded_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_basic_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fill2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_flip_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_float_index_expression_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fractional_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_full_boolean_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_full_truncation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_getitem_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_fallback2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_select_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_layout_optimization_input_mutations_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inner_fn_str_and_stride_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_add_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_issue102546_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_kwargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_strided_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_leaky_relu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_log1p_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mean_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mm_mixed_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_move_arange_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_prime_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_neg_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_no_op_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nonzero_unbacked_refinement_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pad_view_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pattern_matcher_multi_user_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_permute1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_chebyshev_polynomial_w_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_erf_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_gammaincc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_gammaln_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_hermite_polynomial_h_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_scaled_modified_bessel_k0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_shifted_chebyshev_polynomial_t_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_zeta_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_repeat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_require_stride_expanded_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_roll_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scalar_output_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scaled_dot_product_efficient_attention_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_add2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_add3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_select_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_shape_padding_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_shape_prop_torch_ones_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_signbit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_single_elem_indirect_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sort_stable_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_failed_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_with_integer_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_with_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor_index_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tmp_not_defined_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tmp_not_defined_issue3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unsqueeze_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_bilinear2d_b_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_correction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_views6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_views7_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_where_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_zero_dim_reductions_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_abs_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_arange_dynamic_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_cat_unbacked_duplicate_size_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_item_inf_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_item_return_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_zeros_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op10_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op4_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op9_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_noops_tensor_repropagate_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_return_unbacked_view_split_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_slice_scatter_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sym_stride_lowering_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_symint_sum_list_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_cat_backwards_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_wrapper_codegen_statically_known_int_or_none_cuda 2024-08-06T22:02:49.5443953Z 2024-08-06T22:02:52.6960093Z Running inductor/test_torchinductor_dynamic_shapes 4/6 ... [2024-08-06 22:02:52.695322] 2024-08-06T22:02:52.6961621Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'not serial', '--shard-id=4', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:02:52.695697] 2024-08-06T22:03:42.1709660Z 2024-08-06T22:03:42.1711107Z PRINTING LOG FILE of inductor/test_aot_inductor 9/16 (test/test-reports/inductor.test_aot_inductor_9.16_37c7619e1459b5df_.log) 2024-08-06T22:03:42.1714312Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-35cbc1aa0a45a81b.xml 2024-08-06T22:03:42.1715613Z ============================= test session starts ============================== 2024-08-06T22:03:42.1716430Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T22:03:42.1717050Z cachedir: .pytest_cache 2024-08-06T22:03:42.1717808Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T22:03:42.1718494Z rootdir: /var/lib/jenkins/workspace 2024-08-06T22:03:42.1719083Z configfile: pytest.ini 2024-08-06T22:03:42.1724757Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T22:03:42.1725582Z collecting ... collected 912 items 2024-08-06T22:03:42.1725995Z stepcurrent: Cannot find last run test, not skipping 2024-08-06T22:03:42.1766163Z Running 64 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_reuse_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_conv_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_duplicated_params_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_cat_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_embedding_bag_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fqn_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_misaligned_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_simple_split_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_sympy_fn_like_arg_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_nested_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_consecutive_compiles_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_empty_graph_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_multi_device_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_quantized_linear_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_view_outputs_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_while_loop_with_outer_code_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_embedding_bag_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_empty_graph_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_default_cuda_device_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_tensor_input_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_shape_failed_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_seq_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_with_none_input_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_unsupported_input_dtype_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_amp_fallback_random_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_dup_unbacked_sym_decl_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_embedding_bag_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_foreach_multiple_dynamic_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fp8_view_of_param_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_index_put_with_none_index_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda 2024-08-06T22:03:42.1804164Z 2024-08-06T22:03:42.1805140Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.1203s] [ 1%] 2024-08-06T22:03:42.1806645Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.5632s] [ 3%] 2024-08-06T22:03:42.1808296Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 4%] 2024-08-06T22:03:42.1809863Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.5790s] [ 6%] 2024-08-06T22:03:42.1811338Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4224s] [ 7%] 2024-08-06T22:03:42.1812780Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.6510s] [ 9%] 2024-08-06T22:03:42.1814732Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0806 21:41:08.730000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1815908Z W0806 21:41:08.743000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1816530Z W0806 21:41:22.997000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1817150Z W0806 21:41:23.010000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1817679Z PASSED [29.6231s] [ 10%] 2024-08-06T22:03:42.1818655Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.7486s] [ 12%] 2024-08-06T22:03:42.1820468Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0033s] (requires CUDA) [ 14%] 2024-08-06T22:03:42.1822072Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0030s] (requires CUDA) [ 15%] 2024-08-06T22:03:42.1823653Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 17%] 2024-08-06T22:03:42.1825298Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 18%] 2024-08-06T22:03:42.1827076Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (requires CUDA) [ 20%] 2024-08-06T22:03:42.1828943Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (Skipped!) [ 21%] 2024-08-06T22:03:42.1831111Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py In file included from /tmp/tmpfkahbo5l/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/cu6ibe4ozcrsalnzrssfj2ghelijfoltkr6lum43kc6e7zaceo33.cpp:5: 2024-08-06T22:03:42.1833985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h: In instantiation of ‘void torch::aot_inductor::ThreadLocalCachedOutputTensor >::copy_data_from(const torch::aot_inductor::ArrayRefTensor&) [with T = float]’: 2024-08-06T22:03:42.1835880Z /tmp/tmpfkahbo5l/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/cu6ibe4ozcrsalnzrssfj2ghelijfoltkr6lum43kc6e7zaceo33.cpp:633:44: required from here 2024-08-06T22:03:42.1837702Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h:53:19: warning: comparison of integer expressions of different signedness: ‘long unsigned int’ and ‘int64_t’ {aka ‘long int’} [-Wsign-compare] 2024-08-06T22:03:42.1842125Z 53 | if (t.numel() > capacity_) { 2024-08-06T22:03:42.1842494Z PASSED [7.7144s] [ 23%] 2024-08-06T22:03:42.1843640Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [6.4626s] [ 25%] 2024-08-06T22:03:42.1845506Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 26%] 2024-08-06T22:03:42.1848979Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.3285s] [ 28%] 2024-08-06T22:03:42.1850850Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [14.2684s] [ 29%] 2024-08-06T22:03:42.1852581Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (Skipped!) [ 31%] 2024-08-06T22:03:42.1854624Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.2986s] [ 32%] 2024-08-06T22:03:42.1856597Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0031s] (requires CUDA) [ 34%] 2024-08-06T22:03:42.1858459Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 35%] 2024-08-06T22:03:42.1860739Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0028s] (requires CUDA) [ 37%] 2024-08-06T22:03:42.1862750Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0028s] (requires CUDA) [ 39%] 2024-08-06T22:03:42.1864785Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0031s] (requires CUDA) [ 40%] 2024-08-06T22:03:42.1867246Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 42%] 2024-08-06T22:03:42.1869721Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_reuse_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 43%] 2024-08-06T22:03:42.1872158Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_conv_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 45%] 2024-08-06T22:03:42.1874612Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_duplicated_params_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 46%] 2024-08-06T22:03:42.1877289Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_cat_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.1879485Z from /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp:1: 2024-08-06T22:03:42.1882805Z /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp: In member function ‘Outputs torch::aot_inductor::AOTInductorModel::run_impl_minimal_arrayref_interface(const Inputs&, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle) [with Inputs = std::tuple, torch::aot_inductor::ArrayRefTensor >; Outputs = std::tuple >; torch::aot_inductor::DeviceStreamType = void*; AOTIProxyExecutorHandle = AOTIProxyExecutorOpaque*]’: 2024-08-06T22:03:42.1886422Z /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp:549:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-06T22:03:42.1887858Z 549 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg0_1, &arg0_1_size)); 2024-08-06T22:03:42.1888270Z | ^~~~~~ 2024-08-06T22:03:42.1888565Z | | 2024-08-06T22:03:42.1888907Z | torch::aot_inductor::ArrayRefTensor 2024-08-06T22:03:42.1889957Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-06T22:03:42.1890775Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-06T22:03:42.1891064Z | ^~~~ 2024-08-06T22:03:42.1891627Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-06T22:03:42.1893089Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.1894149Z from /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp:1: 2024-08-06T22:03:42.1895870Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-06T22:03:42.1896809Z 207 | AtenTensorHandle tensor, 2024-08-06T22:03:42.1897072Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-06T22:03:42.1897720Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.1898834Z from /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp:1: 2024-08-06T22:03:42.1900715Z /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp:552:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-06T22:03:42.1901969Z 552 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg1_1, &arg1_1_size)); 2024-08-06T22:03:42.1902382Z | ^~~~~~ 2024-08-06T22:03:42.1902704Z | | 2024-08-06T22:03:42.1903045Z | torch::aot_inductor::ArrayRefTensor 2024-08-06T22:03:42.1904002Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-06T22:03:42.1905034Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-06T22:03:42.1905334Z | ^~~~ 2024-08-06T22:03:42.1905885Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-06T22:03:42.1906834Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.1907896Z from /tmp/tmp23287aqd/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cdjhlszf2e7jkj3lr6v54au37y3h3dc7xlrn5v52n2duumvf5ezu.cpp:1: 2024-08-06T22:03:42.1909445Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-06T22:03:42.1910557Z 207 | AtenTensorHandle tensor, 2024-08-06T22:03:42.1910830Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-06T22:03:42.1911146Z XFAIL [2.8944s] [ 48%] 2024-08-06T22:03:42.1912618Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_embedding_bag_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-06T22:03:42.1915037Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fqn_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 51%] 2024-08-06T22:03:42.1917583Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.2464s] [ 53%] 2024-08-06T22:03:42.1920052Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_misaligned_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 54%] 2024-08-06T22:03:42.1922494Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 56%] 2024-08-06T22:03:42.1924765Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_simple_split_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 57%] 2024-08-06T22:03:42.1927200Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_sympy_fn_like_arg_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0030s] (requires CUDA) [ 59%] 2024-08-06T22:03:42.1929172Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_nested_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [12.9639s] [ 60%] 2024-08-06T22:03:42.1930644Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_consecutive_compiles_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.6241s] [ 62%] 2024-08-06T22:03:42.1932110Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_empty_graph_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.1187s] [ 64%] 2024-08-06T22:03:42.1933632Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_multi_device_abi_compatible_cuda <- test/inductor/test_torchinductor.py W0806 21:43:17.146000 14324 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-06T22:03:42.1934836Z W0806 21:43:17.148000 14324 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-06T22:03:42.1935358Z PASSED [10.1725s] [ 65%] 2024-08-06T22:03:42.1936189Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_quantized_linear_abi_compatible_cuda SKIPPED [0.0002s] (Skipped!) [ 67%] 2024-08-06T22:03:42.1937560Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cuda PASSED [9.0559s] [ 68%] 2024-08-06T22:03:42.1939161Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda PASSED [12.8956s] [ 70%] 2024-08-06T22:03:42.1940643Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cuda PASSED [8.3307s] [ 71%] 2024-08-06T22:03:42.1942092Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_view_outputs_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.7680s] [ 73%] 2024-08-06T22:03:42.1943567Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_while_loop_with_outer_code_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.9421s] [ 75%] 2024-08-06T22:03:42.1945107Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_embedding_bag_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.5849s] [ 76%] 2024-08-06T22:03:42.1946588Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_empty_graph_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.7343s] [ 78%] 2024-08-06T22:03:42.1948059Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_default_cuda_device_non_abi_compatible_cpu SKIPPED [0.0002s] (requires multiple cuda devices) [ 79%] 2024-08-06T22:03:42.1949588Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_tensor_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0806 21:44:43.599000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1950763Z W0806 21:44:43.611000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1951389Z W0806 21:44:57.903000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1952015Z W0806 21:44:57.915000 14324 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1952531Z PASSED [29.9180s] [ 81%] 2024-08-06T22:03:42.1953660Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_shape_failed_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 82%] 2024-08-06T22:03:42.1955651Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_seq_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py In file included from /tmp/tmpg71jkpqb/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/cho4howegwiyf4ikcgu5ofuyppghx6noznthsh6xrxe7elj4txuu.cpp:463: 2024-08-06T22:03:42.1957922Z /tmp/tmpg71jkpqb/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h: In instantiation of ‘Welford welford_combine(const Welford&, T, const WeightRecp*) [with T = at::vec::CPU_CAPABILITY::Vectorized]’: 2024-08-06T22:03:42.1959458Z /tmp/tmpg71jkpqb/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/cho4howegwiyf4ikcgu5ofuyppghx6noznthsh6xrxe7elj4txuu.cpp:482:81: required from here 2024-08-06T22:03:42.1961573Z /tmp/tmpg71jkpqb/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h:107:35: warning: comparison of integer expressions of different signedness: ‘const int64_t’ {aka ‘const long int’} and ‘std::vector >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare] 2024-08-06T22:03:42.1962821Z 107 | ((w == nullptr || acc.index >= w->weight_recps.size()) 2024-08-06T22:03:42.1963212Z PASSED [15.8003s] [ 84%] 2024-08-06T22:03:42.1964262Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 85%] 2024-08-06T22:03:42.1966063Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_with_none_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (requires CUDA) [ 87%] 2024-08-06T22:03:42.1967593Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_unsupported_input_dtype_non_abi_compatible_cpu PASSED [0.0378s] [ 89%] 2024-08-06T22:03:42.1969006Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_amp_fallback_random_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4998s] [ 90%] 2024-08-06T22:03:42.1970555Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_dup_unbacked_sym_decl_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.9103s] [ 92%] 2024-08-06T22:03:42.1972123Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_embedding_bag_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.7213s] [ 93%] 2024-08-06T22:03:42.1973679Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_foreach_multiple_dynamic_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4619s] [ 95%] 2024-08-06T22:03:42.1975334Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fp8_view_of_param_non_abi_compatible_cuda SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 96%] 2024-08-06T22:03:42.1976836Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_index_put_with_none_index_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.5812s] [ 98%] 2024-08-06T22:03:42.1978366Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda SKIPPED [0.0002s] (requires multiple cuda devices) [100%] 2024-08-06T22:03:42.1979102Z 2024-08-06T22:03:42.1979670Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-35cbc1aa0a45a81b.xml - 2024-08-06T22:03:42.1980720Z ============ 32 passed, 31 skipped, 1 xfailed in 390.25s (0:06:30) ============= 2024-08-06T22:03:42.1981188Z Got exit code -11 (SIGSEGV) 2024-08-06T22:03:42.1981443Z Retrying single test... 2024-08-06T22:03:42.1982018Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-d858e15221cdc00b.xml 2024-08-06T22:03:42.1982748Z ============================= test session starts ============================== 2024-08-06T22:03:42.1983270Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T22:03:42.1983733Z cachedir: .pytest_cache 2024-08-06T22:03:42.1984262Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T22:03:42.1984845Z rootdir: /var/lib/jenkins/workspace 2024-08-06T22:03:42.1985144Z configfile: pytest.ini 2024-08-06T22:03:42.1985615Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T22:03:42.1986167Z collecting ... collected 912 items 2024-08-06T22:03:42.1986575Z stepcurrent: Cannot find last run test, not skipping 2024-08-06T22:03:42.1986908Z Running 64 items in this shard 2024-08-06T22:03:42.1987072Z 2024-08-06T22:03:42.1987805Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.9190s] [ 1%] 2024-08-06T22:03:42.1989257Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.6102s] [ 3%] 2024-08-06T22:03:42.1990758Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 4%] 2024-08-06T22:03:42.1993074Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4566s] [ 6%] 2024-08-06T22:03:42.1994573Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.2728s] [ 7%] 2024-08-06T22:03:42.1996044Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.6569s] [ 9%] 2024-08-06T22:03:42.1997636Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0806 21:52:46.302000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1998795Z W0806 21:52:46.320000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.1999414Z W0806 21:53:00.542000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.2000039Z W0806 21:53:00.554000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.2000551Z PASSED [29.2696s] [ 10%] 2024-08-06T22:03:42.2001494Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.5441s] [ 12%] 2024-08-06T22:03:42.2003055Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 14%] 2024-08-06T22:03:42.2004663Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 15%] 2024-08-06T22:03:42.2006253Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 17%] 2024-08-06T22:03:42.2008534Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 18%] 2024-08-06T22:03:42.2010407Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0028s] (requires CUDA) [ 20%] 2024-08-06T22:03:42.2012413Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0001s] (Skipped!) [ 21%] 2024-08-06T22:03:42.2015056Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py In file included from /tmp/tmplqhm0ia0/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/ci4sjwleq3g55antp4vteodem34nbrmqxvqwlx7amljpks6b5y77.cpp:5: 2024-08-06T22:03:42.2018085Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h: In instantiation of ‘void torch::aot_inductor::ThreadLocalCachedOutputTensor >::copy_data_from(const torch::aot_inductor::ArrayRefTensor&) [with T = float]’: 2024-08-06T22:03:42.2019939Z /tmp/tmplqhm0ia0/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/ci4sjwleq3g55antp4vteodem34nbrmqxvqwlx7amljpks6b5y77.cpp:633:44: required from here 2024-08-06T22:03:42.2022062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h:53:19: warning: comparison of integer expressions of different signedness: ‘long unsigned int’ and ‘int64_t’ {aka ‘long int’} [-Wsign-compare] 2024-08-06T22:03:42.2023239Z 53 | if (t.numel() > capacity_) { 2024-08-06T22:03:42.2023596Z PASSED [7.5884s] [ 23%] 2024-08-06T22:03:42.2024730Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [6.4430s] [ 25%] 2024-08-06T22:03:42.2026531Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 26%] 2024-08-06T22:03:42.2028984Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.1853s] [ 28%] 2024-08-06T22:03:42.2030875Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [14.0493s] [ 29%] 2024-08-06T22:03:42.2032615Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (Skipped!) [ 31%] 2024-08-06T22:03:42.2034380Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.1440s] [ 32%] 2024-08-06T22:03:42.2036251Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0032s] (requires CUDA) [ 34%] 2024-08-06T22:03:42.2038109Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 35%] 2024-08-06T22:03:42.2040030Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0028s] (requires CUDA) [ 37%] 2024-08-06T22:03:42.2042028Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 39%] 2024-08-06T22:03:42.2044043Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0032s] (requires CUDA) [ 40%] 2024-08-06T22:03:42.2046364Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 42%] 2024-08-06T22:03:42.2048824Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_reuse_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 43%] 2024-08-06T22:03:42.2051352Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_conv_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 45%] 2024-08-06T22:03:42.2053894Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_duplicated_params_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 46%] 2024-08-06T22:03:42.2056713Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_cat_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.2058864Z from /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp:1: 2024-08-06T22:03:42.2062165Z /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp: In member function ‘Outputs torch::aot_inductor::AOTInductorModel::run_impl_minimal_arrayref_interface(const Inputs&, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle) [with Inputs = std::tuple, torch::aot_inductor::ArrayRefTensor >; Outputs = std::tuple >; torch::aot_inductor::DeviceStreamType = void*; AOTIProxyExecutorHandle = AOTIProxyExecutorOpaque*]’: 2024-08-06T22:03:42.2065626Z /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp:549:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-06T22:03:42.2067228Z 549 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg0_1, &arg0_1_size)); 2024-08-06T22:03:42.2067714Z | ^~~~~~ 2024-08-06T22:03:42.2068019Z | | 2024-08-06T22:03:42.2068361Z | torch::aot_inductor::ArrayRefTensor 2024-08-06T22:03:42.2069342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-06T22:03:42.2070150Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-06T22:03:42.2070444Z | ^~~~ 2024-08-06T22:03:42.2071006Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-06T22:03:42.2071967Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.2073063Z from /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp:1: 2024-08-06T22:03:42.2074694Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-06T22:03:42.2087461Z 207 | AtenTensorHandle tensor, 2024-08-06T22:03:42.2087818Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-06T22:03:42.2088484Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.2089647Z from /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp:1: 2024-08-06T22:03:42.2092003Z /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp:552:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-06T22:03:42.2093783Z 552 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg1_1, &arg1_1_size)); 2024-08-06T22:03:42.2094203Z | ^~~~~~ 2024-08-06T22:03:42.2094628Z | | 2024-08-06T22:03:42.2094971Z | torch::aot_inductor::ArrayRefTensor 2024-08-06T22:03:42.2095979Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-06T22:03:42.2097002Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-06T22:03:42.2097303Z | ^~~~ 2024-08-06T22:03:42.2097881Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-06T22:03:42.2098843Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-06T22:03:42.2099931Z from /tmp/tmp7dee4cnx/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/chvmlvynihh6ph35cqonnxjtd4tsbqjnpa73xosapfgjm3u74sws.cpp:1: 2024-08-06T22:03:42.2101503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-06T22:03:42.2102451Z 207 | AtenTensorHandle tensor, 2024-08-06T22:03:42.2102731Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-06T22:03:42.2103046Z XFAIL [2.9526s] [ 48%] 2024-08-06T22:03:42.2104546Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_embedding_bag_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-06T22:03:42.2106994Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fqn_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 51%] 2024-08-06T22:03:42.2109436Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.3204s] [ 53%] 2024-08-06T22:03:42.2111933Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_misaligned_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 54%] 2024-08-06T22:03:42.2114545Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 56%] 2024-08-06T22:03:42.2116892Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_simple_split_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0001s] (Skipped!) [ 57%] 2024-08-06T22:03:42.2119664Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_sympy_fn_like_arg_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0028s] (requires CUDA) [ 59%] 2024-08-06T22:03:42.2121890Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_nested_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [12.8095s] [ 60%] 2024-08-06T22:03:42.2123374Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_consecutive_compiles_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.8113s] [ 62%] 2024-08-06T22:03:42.2124881Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_empty_graph_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.0994s] [ 64%] 2024-08-06T22:03:42.2126368Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_multi_device_abi_compatible_cuda <- test/inductor/test_torchinductor.py W0806 21:54:53.638000 19190 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-06T22:03:42.2127553Z W0806 21:54:53.639000 19190 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-06T22:03:42.2128148Z PASSED [9.9993s] [ 65%] 2024-08-06T22:03:42.2129097Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_quantized_linear_abi_compatible_cuda SKIPPED [0.0002s] (Skipped!) [ 67%] 2024-08-06T22:03:42.2130491Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cuda PASSED [8.9585s] [ 68%] 2024-08-06T22:03:42.2131986Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda PASSED [12.7622s] [ 70%] 2024-08-06T22:03:42.2133488Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cuda PASSED [8.0218s] [ 71%] 2024-08-06T22:03:42.2135109Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_view_outputs_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.7216s] [ 73%] 2024-08-06T22:03:42.2136588Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_while_loop_with_outer_code_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.9327s] [ 75%] 2024-08-06T22:03:42.2138088Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_embedding_bag_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.5004s] [ 76%] 2024-08-06T22:03:42.2139566Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_empty_graph_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.4991s] [ 78%] 2024-08-06T22:03:42.2141041Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_default_cuda_device_non_abi_compatible_cpu SKIPPED [0.0002s] (requires multiple cuda devices) [ 79%] 2024-08-06T22:03:42.2142683Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_tensor_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0806 21:56:19.006000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.2143874Z W0806 21:56:19.018000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.2144509Z W0806 21:56:33.297000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.2145142Z W0806 21:56:33.311000 19190 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-06T22:03:42.2145702Z PASSED [29.7327s] [ 81%] 2024-08-06T22:03:42.2146762Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_shape_failed_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 82%] 2024-08-06T22:03:42.2149197Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_seq_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py In file included from /tmp/tmpfawbtqrn/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/cuwlwsptjdbm6vouri5rxw5a6xvbw4mkscwsjanfwh43pt5wfcbg.cpp:463: 2024-08-06T22:03:42.2151512Z /tmp/tmpfawbtqrn/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h: In instantiation of ‘Welford welford_combine(const Welford&, T, const WeightRecp*) [with T = at::vec::CPU_CAPABILITY::Vectorized]’: 2024-08-06T22:03:42.2153028Z /tmp/tmpfawbtqrn/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/cuwlwsptjdbm6vouri5rxw5a6xvbw4mkscwsjanfwh43pt5wfcbg.cpp:482:81: required from here 2024-08-06T22:03:42.2155135Z /tmp/tmpfawbtqrn/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h:107:35: warning: comparison of integer expressions of different signedness: ‘const int64_t’ {aka ‘const long int’} and ‘std::vector >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare] 2024-08-06T22:03:42.2156386Z 107 | ((w == nullptr || acc.index >= w->weight_recps.size()) 2024-08-06T22:03:42.2156793Z PASSED [15.5959s] [ 84%] 2024-08-06T22:03:42.2157836Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cpu SKIPPED [0.0032s] (requires CUDA) [ 85%] 2024-08-06T22:03:42.2159516Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_with_none_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (requires CUDA) [ 87%] 2024-08-06T22:03:42.2161000Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_unsupported_input_dtype_non_abi_compatible_cpu PASSED [0.0387s] [ 89%] 2024-08-06T22:03:42.2162430Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_amp_fallback_random_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4360s] [ 90%] 2024-08-06T22:03:42.2163976Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_dup_unbacked_sym_decl_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.6431s] [ 92%] 2024-08-06T22:03:42.2165504Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_embedding_bag_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.5788s] [ 93%] 2024-08-06T22:03:42.2167059Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_foreach_multiple_dynamic_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.7648s] [ 95%] 2024-08-06T22:03:42.2168576Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fp8_view_of_param_non_abi_compatible_cuda SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 96%] 2024-08-06T22:03:42.2170160Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_index_put_with_none_index_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.3711s] [ 98%] 2024-08-06T22:03:42.2171689Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda SKIPPED [0.0002s] (requires multiple cuda devices) [100%] 2024-08-06T22:03:42.2172423Z 2024-08-06T22:03:42.2172989Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-d858e15221cdc00b.xml - 2024-08-06T22:03:42.2174044Z ============ 32 passed, 31 skipped, 1 xfailed in 386.89s (0:06:26) ============= 2024-08-06T22:03:42.2174652Z Got exit code -11 (SIGSEGV) 2024-08-06T22:03:42.2174985Z Retrying single test... 2024-08-06T22:03:42.2175618Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-a82215d487f8633b.xml 2024-08-06T22:03:42.2176381Z ============================= test session starts ============================== 2024-08-06T22:03:42.2176900Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T22:03:42.2177357Z cachedir: .pytest_cache 2024-08-06T22:03:42.2177899Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T22:03:42.2178479Z rootdir: /var/lib/jenkins/workspace 2024-08-06T22:03:42.2178772Z configfile: pytest.ini 2024-08-06T22:03:42.2179323Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T22:03:42.2179997Z collecting ... collected 912 items / 63 deselected / 849 selected 2024-08-06T22:03:42.2180936Z stepcurrent: skipping 63 already run items. Running only test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda 2024-08-06T22:03:42.2181746Z Running 1 items in this shard 2024-08-06T22:03:42.2181916Z 2024-08-06T22:03:42.2182709Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda SKIPPED [0.0002s] (requires multiple cuda devices) [100%] 2024-08-06T22:03:42.2183442Z 2024-08-06T22:03:42.2183998Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-a82215d487f8633b.xml - 2024-08-06T22:03:42.2185010Z ====================== 1 skipped, 63 deselected in 0.08s ======================= 2024-08-06T22:03:42.2185451Z Got exit code 0 2024-08-06T22:03:42.2185782Z Test succeeeded in new process, continuing with the rest of the tests 2024-08-06T22:03:42.2186495Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-24481963e654c7d1.xml 2024-08-06T22:03:42.2187233Z ============================= test session starts ============================== 2024-08-06T22:03:42.2187759Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-06T22:03:42.2188208Z cachedir: .pytest_cache 2024-08-06T22:03:42.2188746Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-06T22:03:42.2189337Z rootdir: /var/lib/jenkins/workspace 2024-08-06T22:03:42.2189609Z configfile: pytest.ini 2024-08-06T22:03:42.2190061Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-06T22:03:42.2190733Z collecting ... collected 912 items / 64 deselected / 848 selected 2024-08-06T22:03:42.2191137Z stepcurrent: skipping 64 already run items. 2024-08-06T22:03:42.2191441Z Running 0 items in this shard 2024-08-06T22:03:42.2191617Z 2024-08-06T22:03:42.2192614Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-24481963e654c7d1.xml - 2024-08-06T22:03:42.2194082Z ============================ 64 deselected in 0.08s ============================ 2024-08-06T22:03:42.2195067Z The following tests failed and then succeeded when run in a new process['ul', 'test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda'] 2024-08-06T22:03:42.2195831Z 2024-08-06T22:03:42.2196248Z FINISHED PRINTING LOG FILE of inductor/test_aot_inductor 9/16 (test/test-reports/inductor.test_aot_inductor_9.16_37c7619e1459b5df_.log) 2024-08-06T22:03:42.2196756Z 2024-08-06T22:03:45.3653081Z Running inductor/test_cuda_cpp_wrapper 4/5 ... [2024-08-06 22:03:45.364774] 2024-08-06T22:03:45.3655676Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_cpp_wrapper.py', '-m', 'not serial', '--shard-id=4', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:03:45.365166] 2024-08-06T22:09:31.6601267Z 2024-08-06T22:09:31.6602911Z inductor/test_torchinductor_dynamic_shapes 3/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_3.6_04d78c180807a766_.log 2024-08-06T22:09:31.6769103Z Running 224 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test__unsafe_masked_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_const_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_const_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_angle_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_dtype_device_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_use_after_remove_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_empty_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_upcasting_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_complex_memory_overlap_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_consecutive_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_consecutive_split_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_convolution5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_op_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_would_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dense_mask_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div9_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_bag_byte_unpack_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_bag_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fallback_mutable_op_basic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fill2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float_index_expression_type_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_floordiv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gather2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gelu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_horizonal_fusion2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_device_assert_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_select_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_activations_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_int_input_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_grid_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_offset_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_leaky_relu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_masked_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_min_max_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_misaligned_address_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mixed_mm3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_gpu_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_threading_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nll_loss_forward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pattern_matcher_multi_user_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_j0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_logit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reinterpret_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_no_ops_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_require_stride_expanded_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scalar_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_should_pad_bench_for_bmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sin_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_single_elem_indirect_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sort_stable_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_failed_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_with_integer_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_std_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_strided_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tmp_not_defined_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unsqueeze_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_cat_conv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vectorized_ops_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_on_aliased_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_xblock_divides_xnumel_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_zero_dim_reductions_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_const_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_const_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_support_out_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_min_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cauchy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_concat_add_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv2d_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv_with_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_no_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_data_type_propogation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dense_mask_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_diagonal_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dist_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_prim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_trivial_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fuse_tiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gelu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_horizonal_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_as_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_insignificant_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_invalid_operand_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_isinf2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_kernel_names_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_logcumsumexp_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_low_memory_max_pool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_threading_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_sum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nan_to_num_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_neg_max_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_new_cpp_build_logical_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_new_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pad_cast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i1e_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_logit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_ndtri_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randint_kernel_count_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_generator_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_with_dtype_and_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_reduce3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_setitem_with_int_parameter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_should_pad_bench_for_bmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sign_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_silu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_simplify_loops_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sort_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze_varargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum_keepdims_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_nearest2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_vectorized_ops_masked_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_as_real_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_detach_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_adaptive_max_pool3d_with_indices_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_is_integer_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op7_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op8_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_nonzero_no_realloc_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_save_for_backwards_cuda 2024-08-06T22:09:31.6928835Z 2024-08-06T22:09:34.9971256Z Running inductor/test_inductor_freezing 1/1 ... [2024-08-06 22:09:34.996546] 2024-08-06T22:09:34.9973126Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_freezing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:09:34.996939] 2024-08-06T22:10:26.8828419Z 2024-08-06T22:10:26.8829939Z inductor/test_torchinductor_dynamic_shapes 4/6 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_4.6_fe53037062bc5225_.log 2024-08-06T22:10:26.8989829Z Running 221 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_support_out_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_baddbmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_batch_norm_2d_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bfloat16_to_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_both_scalars_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_default_kwargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_single_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_unbacked_empty_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_nd_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_functional_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_inference_heuristics_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_convolution2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cumprod_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_op_multi_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout_trivial_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_fusion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_empty2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_expm1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fft_real_input_real_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fill1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float16_to_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fmod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_full_boolean_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_generate_rand_fp8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_glu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_grid_sampler_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_hardtanh_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_horizonal_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_failed_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_fallback1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_fallback2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_indirect_load_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_insignificant_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_isinf2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_l1_loss_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_tensor_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_rands2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_rands3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linear1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linear2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_list_clearing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_log_fp64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_long_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_new_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_new_ones_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_no_op_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pad_cast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_airy_ai_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erf_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_expm1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_ndtr_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_rsqrt_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_add3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scheduler_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_select_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_silu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_simplify_loops_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tensor1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_to_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_triu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unroll_small_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_var_mean_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_as_complex_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_detach_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_inplace_permuted_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_dtype_device_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin_with_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d_backward3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_buffer_batch_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_buffer_use_after_remove_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_builtins_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_empty_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_single_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_compar_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_complex_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_computed_buffer_inlining_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_config_option_dont_assume_alignment_recompiles_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_consecutive_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_consecutive_split_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv2d_backward_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv3d_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cudnn_rnn_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dist_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_trivial_0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtype_sympy_expr_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_empty2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_exp2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_expand_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_expand_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_expm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fill1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_float_index_expression_type_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fractional_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_full_like_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_generate_rand_fp8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_glu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_hardswish_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_horizonal_fusion2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_fallback1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_indirect_load_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_activations_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_isinf_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_l1_loss_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_broadcast_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linear_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_logsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mixed_mm2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mul_index_expr_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_gpu_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_var_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_narrow_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nll_loss_forward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_permute2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_bessel_j0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_bessel_y1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_chebyshev_polynomial_t_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_digamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_erfinv_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_modified_bessel_k0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_xlog1py_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_rand_like_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randint_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction_config_limit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_relu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_repeat_interleave_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_round_correctness_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_reduce1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_mutation2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_softmax_one_kernel_loop_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sqrt_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_stack_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tan_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_to_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_to_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_topk_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_transposed_propagates_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_cat_conv_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_nearest2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_nearest3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_as_complex_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_on_aliased_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_xblock_divides_xnumel_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_zero_element_mutation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_constant_fold_uniform_value_dynamic_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_full_recompiles_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_unbacked_stride_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op0_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op5_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op6_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_slice_index_changing_sign_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_index_select_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unwrap_storage_didnt_work_repro_cuda 2024-08-06T22:10:26.9127751Z 2024-08-06T22:10:30.1101565Z Running inductor/test_fx_fusion 1/1 ... [2024-08-06 22:10:30.109545] 2024-08-06T22:10:30.1105063Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fx_fusion.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:10:30.109935] 2024-08-06T22:10:34.5341030Z 2024-08-06T22:10:34.5342563Z inductor/test_fx_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fx_fusion_1.1_dbaca286ee0bc292_.log 2024-08-06T22:10:34.5345446Z Running 4 items in this shard: test/inductor/test_fx_fusion.py::TestFxFusion::test_linear_permute_fusion, test/inductor/test_fx_fusion.py::TestFxFusion::test_permute_bmm_fusion, test/inductor/test_fx_fusion.py::TestFxFusion::test_permute_linear_fusion, test/inductor/test_fx_fusion.py::TestFxFusion::test_sink_cat_after_pointwise 2024-08-06T22:10:34.5347523Z 2024-08-06T22:10:37.7660263Z Running torch_np/test_dtype 1/1 ... [2024-08-06 22:10:37.765387] 2024-08-06T22:10:37.7662495Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_dtype.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:10:37.765812] 2024-08-06T22:10:41.3385632Z 2024-08-06T22:10:41.3386582Z torch_np/test_dtype 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_dtype_1.1_e4079a06cfe371cf_.log 2024-08-06T22:10:41.3400315Z Running 44 items in this shard: test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'bool_', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'complex128', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'complex64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'float16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'float32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'float64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_bool, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'bool_', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'complex128', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'complex64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'float16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'float32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'float64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.bool_, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.complex128, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.complex64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.dtype('bool'), test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.float16, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.float32, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.float64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int16, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int32, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int8, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint16, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint32, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint8 2024-08-06T22:10:41.3413013Z 2024-08-06T22:10:44.5393020Z Running test_deploy 1/1 ... [2024-08-06 22:10:44.538774] 2024-08-06T22:10:44.5396189Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_deploy.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:10:44.539192] 2024-08-06T22:10:48.1621641Z 2024-08-06T22:10:48.1622529Z test_deploy 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_deploy_1.1_3d3204ec08a4930d_.log 2024-08-06T22:10:48.1623316Z Running 1 items in this shard: test/test_deploy.py::TestFreezer::test_compile_string 2024-08-06T22:10:48.1623656Z 2024-08-06T22:10:51.4383634Z Running profiler/test_cpp_thread 1/1 ... [2024-08-06 22:10:51.437811] 2024-08-06T22:10:51.4385572Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_cpp_thread.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:10:51.438183] 2024-08-06T22:10:55.9620986Z 2024-08-06T22:10:55.9622050Z profiler/test_cpp_thread 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_cpp_thread_1.1_e137648a79de89f6_.log 2024-08-06T22:10:55.9623629Z Running 3 items in this shard: test/profiler/test_cpp_thread.py::CppThreadTest::test_profile_memory, test/profiler/test_cpp_thread.py::CppThreadTest::test_with_enable_profiler_in_child_thread, test/profiler/test_cpp_thread.py::CppThreadTest::test_without_enable_profiler_in_child_thread 2024-08-06T22:10:55.9624675Z 2024-08-06T22:10:59.1789603Z Running dynamo/test_exceptions 1/1 ... [2024-08-06 22:10:59.178417] 2024-08-06T22:10:59.1792405Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_exceptions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:10:59.178837] 2024-08-06T22:11:00.3428273Z 2024-08-06T22:11:00.3429907Z inductor/test_inductor_freezing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_freezing_1.1_af5283ef1d76b32e_.log 2024-08-06T22:11:00.3444868Z Running 44 items in this shard: test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_aliased_param_return_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_autocast_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_bn_with_multi_bn_share_conv_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_functional_bn_with_multi_bn_share_conv_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_layout_convert_with_view_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_multiple_uses_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_weight_layout_convert_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_conv_with_as_strided_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_cpp_wrapper_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_dont_change_dtype_folding_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_error_on_eager_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_bn_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_bn_with_module_sharing_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_folded_conv_functional_bn_with_module_sharing_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_mm_concat_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_mutation_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_param_deallocated_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_redundant_clone_for_layout_convert_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_rng_op_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_symint_not_folded_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_unequal_bias_horizontal_addmm_fusion_cpu, test/inductor/test_inductor_freezing.py::FreezingCpuTests::test_unfolded_bn_cpu, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_aliased_param_return_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_autocast_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_conv_bn_with_multi_bn_share_conv_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_conv_functional_bn_with_multi_bn_share_conv_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_conv_layout_convert_with_view_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_conv_multiple_uses_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_conv_weight_layout_convert_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_conv_with_as_strided_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_cpp_wrapper_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_dont_change_dtype_folding_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_error_on_eager_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_folded_conv_bn_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_folded_conv_bn_with_module_sharing_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_folded_conv_functional_bn_with_module_sharing_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_mm_concat_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_mutation_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_param_deallocated_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_redundant_clone_for_layout_convert_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_rng_op_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_symint_not_folded_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_unequal_bias_horizontal_addmm_fusion_cuda, test/inductor/test_inductor_freezing.py::FreezingCudaTests::test_unfolded_bn_cuda 2024-08-06T22:11:00.3459540Z 2024-08-06T22:11:03.0011892Z 2024-08-06T22:11:03.0012954Z dynamo/test_exceptions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_exceptions_1.1_ff88573c60c242ad_.log 2024-08-06T22:11:03.0017754Z Running 14 items in this shard: test/dynamo/test_exceptions.py::ExceptionTests::test_custom_getattr_on_module_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_dynamo_undo_kw_names, test/dynamo/test_exceptions.py::ExceptionTests::test_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_exception2, test/dynamo/test_exceptions.py::ExceptionTests::test_exception3, test/dynamo/test_exceptions.py::ExceptionTests::test_exception4, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_else, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_raised_from_child, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_another_exception, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_another_exception2, test/dynamo/test_exceptions.py::ExceptionTests::test_exception_with_ctx_manager, test/dynamo/test_exceptions.py::ExceptionTests::test_key_error, test/dynamo/test_exceptions.py::ExceptionTests::test_nn_module_getattr, test/dynamo/test_exceptions.py::ExceptionTests::test_stop_iteration 2024-08-06T22:11:03.0021910Z 2024-08-06T22:11:03.5999841Z Running test_fx_experimental 1/1 ... [2024-08-06 22:11:03.599420] 2024-08-06T22:11:03.6001986Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_experimental.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-06 22:11:03.599835] 2024-08-06T22:11:23.3554626Z 2024-08-06T22:11:23.3556254Z test_fx_experimental 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_experimental_1.1_38d30b9876216252_.log 2024-08-06T22:11:23.4005324Z Running 714 items in this shard: test/test_fx_experimental.py::TestFXExperimental::test_annotate_getitem_node, test/test_fx_experimental.py::TestFXExperimental::test_annotate_returns_with_schema, test/test_fx_experimental.py::TestFXExperimental::test_aot_based_partition, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_no_msg, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_with_empty_msg, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_with_msg, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_with_multiline_message, test/test_fx_experimental.py::TestFXExperimental::test_conv_bn_fusion, test/test_fx_experimental.py::TestFXExperimental::test_conv_bn_fusion_mixed_dtype, test/test_fx_experimental.py::TestFXExperimental::test_conv_bn_fusion_not_running_state, test/test_fx_experimental.py::TestFXExperimental::test_cost_aware_partition, test/test_fx_experimental.py::TestFXExperimental::test_fetch, test/test_fx_experimental.py::TestFXExperimental::test_find_single_partition, test/test_fx_experimental.py::TestFXExperimental::test_lack_of_devices, test/test_fx_experimental.py::TestFXExperimental::test_large_node_error, test/test_fx_experimental.py::TestFXExperimental::test_merge_matmuls, test/test_fx_experimental.py::TestFXExperimental::test_meta_tracer, test/test_fx_experimental.py::TestFXExperimental::test_normalize_args, test/test_fx_experimental.py::TestFXExperimental::test_normalize_args_perserve_type, test/test_fx_experimental.py::TestFXExperimental::test_normalize_args_preserve_meta, test/test_fx_experimental.py::TestFXExperimental::test_normalize_binary_operators, test/test_fx_experimental.py::TestFXExperimental::test_normalize_modules_exhaustive, test/test_fx_experimental.py::TestFXExperimental::test_optimize_for_inference_cpu, test/test_fx_experimental.py::TestFXExperimental::test_optimize_for_inference_cpu_torchvision, test/test_fx_experimental.py::TestFXExperimental::test_partition_device_mapping, test/test_fx_experimental.py::TestFXExperimental::test_partition_latency, test/test_fx_experimental.py::TestFXExperimental::test_partition_node_manipulation, test/test_fx_experimental.py::TestFXExperimental::test_replace_target_nodes_with, test/test_fx_experimental.py::TestFXExperimental::test_saturate_host, test/test_fx_experimental.py::TestFXExperimental::test_size_based_partition, test/test_fx_experimental.py::TestFXExperimental::test_sparse_nn_partition, test/test_fx_experimental.py::TestFXExperimental::test_split_module_dead_code, test/test_fx_experimental.py::TestFXExperimental::test_split_module_default_arg, test/test_fx_experimental.py::TestFXExperimental::test_split_module_kwargs_expansion, test/test_fx_experimental.py::TestFXExperimental::test_split_qualname_mapping, test/test_fx_experimental.py::TestFXExperimental::test_subgraph_creation, test/test_fx_experimental.py::TestFXExperimental::test_subgraph_trivial_resnet, test/test_fx_experimental.py::TestFXExperimental::test_subgraph_uniquename, test/test_fx_experimental.py::TestFXExperimental::test_to_folder, test/test_fx_experimental.py::TestFXExperimental::test_traceable_function_with_nonstandard_name, test/test_fx_experimental.py::TestFXExperimental::test_type_matches, test/test_fx_experimental.py::TestTranslationValidation::test_sat, test/test_fx_experimental.py::TestTranslationValidation::test_sympy_to_z3, test/test_fx_experimental.py::TestTranslationValidation::test_unsat, test/test_fx_experimental.py::TestTranslationValidation::test_z3str, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_args_op_overload_cuda, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_H_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_T_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___getitem___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___radd___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rdiv___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rmatmul___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rmod___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rmul___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rpow___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rsub___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__batch_norm_with_update_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__chunk_cat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__native_batch_norm_legit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__segment_reduce_lengths_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__segment_reduce_offsets_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__softmax_backward_data_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__unsafe_masked_index_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__unsafe_masked_index_put_accumulate_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__upsample_bilinear2d_aa_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_abs_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_acos_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_acosh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_add_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addbmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addcdiv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addcmul_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addmm_decomposed_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addmv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_alias_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_all_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_allclose_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_aminmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_angle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_any_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_arange_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argsort_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argwhere_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_partial_views_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_asin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_asinh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atan2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atan_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atanh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atleast_1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atleast_2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atleast_3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_baddbmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bernoulli_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bfloat16_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_block_diag_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bool_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_broadcast_shapes_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_broadcast_tensors_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_broadcast_to_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bucketize_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_byte_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cartesian_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cauchy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cdist_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cdouble_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ceil_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cfloat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_chalf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_char_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cholesky_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cholesky_inverse_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cholesky_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_chunk_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clamp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clamp_max_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clamp_min_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clone_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_column_stack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_combinations_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_complex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_conj_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_conj_physical_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_constant_pad_nd_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_contiguous_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_copysign_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_corrcoef_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cos_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cosh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_count_nonzero_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cov_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cross_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cummax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cummin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cumprod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cumsum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cumulative_trapezoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_deg2rad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diag_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diag_embed_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagflat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagonal_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagonal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagonal_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diff_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_digamma_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dist_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_div_floor_rounding_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_div_no_rounding_mode_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_div_trunc_rounding_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_double_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dsplit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dstack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_einsum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_permuted_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_strided_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_eq_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_equal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_erf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_erfc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_erfinv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_exp2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_exp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expand_as_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expand_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expand_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expm1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_exponential_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_eye_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fftshift_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_hfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_hfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_hfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifftshift_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ihfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ihfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ihfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_irfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_irfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_irfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_rfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_rfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_rfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fill_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_flatten_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_flip_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fliplr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_flipud_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_float_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_float_power_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_floor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_floor_divide_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fmod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_frac_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_frexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_full_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_full_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_gather_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ge_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_geometric_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_geqrf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_gradient_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_grid_sampler_2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_gt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_half_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_heaviside_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_histc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hsplit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hstack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hypot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_i0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_igamma_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_igammac_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_add_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_fill_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_put_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_select_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_inner_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_int_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isclose_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isfinite_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isinf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isnan_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isneginf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isposinf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isreal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_item_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_2inputs_2outputs_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_binary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_binary_return_by_ref_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_unary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_kron_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_kthvalue_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ldexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_le_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lerp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lgamma_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cholesky_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cholesky_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cond_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cross_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_det_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_det_singular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_diagonal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eig_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eigh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eigvals_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eigvalsh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_householder_product_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_inv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_inv_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_ldl_factor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_ldl_factor_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_ldl_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lstsq_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lstsq_grad_oriented_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_factor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_factor_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_power_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_rank_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_rank_hermitian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_multi_dot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_norm_subgradients_at_zero_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_pinv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_pinv_hermitian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_pinv_singular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_qr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_slogdet_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_solve_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_solve_triangular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_svd_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_svdvals_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_tensorinv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_tensorsolve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_vander_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_vecdot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_vector_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linspace_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linspace_tensor_overload_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log10_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log1p_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_normal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_softmax_with_dtype_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logaddexp2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logaddexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logcumsumexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logdet_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_and_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_not_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_or_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_xor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logspace_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logspace_tensor_overload_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logsumexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_long_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lu_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lu_unpack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mH_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mT_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_argmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_argmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_cumprod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_cumsum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_fill_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_log_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_logaddexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_logsumexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_median_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_normalize_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_select_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_softmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_std_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_sum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_var_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_matmul_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_matrix_exp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_binary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_pool2d_with_indices_backward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_reduction_no_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_reduction_with_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_maximum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_median_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_meshgrid_list_of_tensors_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_meshgrid_variadic_tensors_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_min_binary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_min_reduction_no_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_min_reduction_with_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_minimum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mode_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_movedim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_msort_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mul_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_multinomial_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nan_to_num_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nanmean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nanmedian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nanquantile_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nansum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_narrow_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_narrow_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_native_batch_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_native_dropout_backward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_native_layer_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ne_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_neg_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_empty_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_empty_strided_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_full_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_ones_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_zeros_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nextafter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_alpha_dropout_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_avg_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_avg_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_avg_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_batch_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_bilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_binary_cross_entropy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_celu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_channel_shuffle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv_transpose1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv_transpose2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv_transpose3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_cosine_embedding_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_cosine_similarity_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_cross_entropy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_ctc_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_dropout2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_dropout3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_dropout_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_elu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_embedding_bag_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_embedding_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_fractional_max_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_fractional_max_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_gaussian_nll_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_gelu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_glu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_grid_sample_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_group_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardshrink_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardsigmoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardswish_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardtanh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hinge_embedding_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_huber_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_instance_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_area_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_bicubic_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_bilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_linear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_nearest_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_trilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_kl_div_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_l1_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_layer_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_leaky_relu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_linear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_local_response_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_logsigmoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_margin_ranking_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool1d_grad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool2d_grad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool3d_grad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_mish_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_mse_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multi_head_attention_forward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multi_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multilabel_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_nll_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_normalize_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_circular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_constant_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_reflect_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_replicate_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_replicate_negative_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pairwise_distance_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pdist_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pixel_shuffle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pixel_unshuffle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_poisson_nll_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_prelu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_relu6_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_relu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_rms_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_rrelu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_selu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_silu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_smooth_l1_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_soft_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softmin_with_dtype_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softplus_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softshrink_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softsign_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_tanhshrink_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_threshold_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_triplet_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_unfold_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_upsample_bilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_upsample_nearest_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nonzero_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nonzero_static_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_fro_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_inf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_nuc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_normal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_normal_in_place_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_normal_number_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ones_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ones_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ormqr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_outer_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_pca_lowrank_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_permute_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_pinverse_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polar_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_4_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_positive_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_pow_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_put_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_qr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_quantile_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rad2deg_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rand_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randint_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randint_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randn_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ravel_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_real_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_reciprocal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_remainder_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_renorm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_repeat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_repeat_interleave_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_reshape_as_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_reshape_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resize__cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resize_as__cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resolve_conj_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resolve_neg_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_roll_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rot90_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_decimals_0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_decimals_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_decimals_neg_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rsqrt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rsub_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scalar_tensor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_add_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_sum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_searchsorted_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_select_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_select_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sgn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_short_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sigmoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sign_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_bartlett_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_blackman_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_cosine_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_exponential_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_gaussian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_general_cosine_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_general_hamming_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_hamming_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_hann_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_kaiser_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_nuttall_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signbit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sinc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sinh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_slice_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_slice_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_softmax_with_dtype_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sort_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sparse_mm_reduce_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sparse_sampled_addmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_airy_ai_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_j0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_j1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_y0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_y1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_t_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_u_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_v_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_w_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_entr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_erfcx_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_hermite_polynomial_h_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_hermite_polynomial_he_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_i0e_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_i1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_i1e_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_laguerre_polynomial_l_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_legendre_polynomial_p_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_log_ndtr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_i0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_i1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_k0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_k1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_ndtr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_ndtri_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_scaled_modified_bessel_k0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_scaled_modified_bessel_k1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_spherical_bessel_j0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_xlog1py_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_zeta_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_list_args_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_with_sizes_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_with_sizes_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sqrt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_square_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_squeeze_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_squeeze_multiple_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_stack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_mean_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_stft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sub_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sum_to_size_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_svd_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_svd_lowrank_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_t_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_t_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_take_along_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_take_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tan_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tanh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tensor_split_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tensordot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tile_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_to_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_to_sparse_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_topk_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trace_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_transpose_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trapezoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trapz_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_triangular_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tril_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_triu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_true_divide_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trunc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unbind_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unflatten_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unfold_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unfold_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_uniform_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unique_consecutive_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unique_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsafe_chunk_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsafe_split_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsqueeze_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsqueeze_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_mean_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_vdot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_as_complex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_as_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_vsplit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_vstack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_where_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_xlogy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_zero__cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_zeros_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_zeros_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_quantized_eb_cuda 2024-08-06T22:11:23.4309835Z 2024-08-06T22:18:28.3867163Z 2024-08-06T22:18:28.3868421Z inductor/test_cuda_cpp_wrapper 4/5 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_cpp_wrapper_4.5_f865a2da5bca2dfd_.log 2024-08-06T22:18:28.3879225Z Running 21 items in this shard: test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_add_complex4_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_cat_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_dtypeview_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_dtypeview_fusion_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_linear2_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_linear_relu_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_multi_threading_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_pointwise_hermite_polynomial_he_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_scaled_dot_product_attention_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_silu_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_sum_int_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_bernoulli1_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_buffer_use_after_remove_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_conv_backward_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_convolution1_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_embedding_bag_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_index_tensor_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_repeat_interleave_2_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_scalar_input_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_silu_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_unspec_inputs_cuda_dynamic_shapes_cuda_wrapper 2024-08-06T22:18:28.3890467Z 2024-08-06T22:18:29.6760405Z 2024-08-06T22:18:29.6761052Z real 83m34.478s 2024-08-06T22:18:29.6761864Z user 164m24.935s 2024-08-06T22:18:29.6762084Z sys 12m12.243s 2024-08-06T22:18:29.6762299Z + assert_git_not_dirty 2024-08-06T22:18:29.6762599Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-06T22:18:29.6762978Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *xla* ]] 2024-08-06T22:18:29.6769757Z ++ git status --porcelain 2024-08-06T22:18:29.6770453Z ++ grep -v '?? third_party' 2024-08-06T22:18:32.0931918Z ++ true 2024-08-06T22:18:32.0935492Z + git_status= 2024-08-06T22:18:32.0935829Z + [[ -n '' ]] 2024-08-06T22:18:32.0936109Z + test_libtorch 2 2024-08-06T22:18:32.0936415Z + local SHARD=2 2024-08-06T22:18:32.0936715Z + [[ default != \s\l\o\w ]] 2024-08-06T22:18:32.0937343Z + echo 'Testing libtorch' 2024-08-06T22:18:32.0937650Z Testing libtorch 2024-08-06T22:18:32.0938748Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libbackend_with_compiler.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.0955051Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libjitbackend_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.0969227Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libcaffe2_nvrtc.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.0984675Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10d_cuda_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.1000051Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libshm.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.1016664Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda_linalg.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.1031851Z + ln -sf '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libnvfuser*' /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.1045516Z + export CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.1046227Z + CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:18:32.1046636Z + [[ -z 2 ]] 2024-08-06T22:18:32.1046826Z + [[ 2 == \1 ]] 2024-08-06T22:18:32.1047022Z + [[ -z 2 ]] 2024-08-06T22:18:32.1047212Z + [[ 2 == \2 ]] 2024-08-06T22:18:32.1047407Z + test_libtorch_jit 2024-08-06T22:18:32.1047616Z + pushd test 2024-08-06T22:18:32.1047827Z ~/workspace/test ~/workspace 2024-08-06T22:18:32.1048094Z + python cpp/jit/tests_setup.py setup 2024-08-06T22:18:33.9631307Z + popd 2024-08-06T22:18:33.9631560Z ~/workspace 2024-08-06T22:18:33.9631841Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 == *cuda* ]] 2024-08-06T22:18:33.9632175Z + [[ default != *nogpu* ]] 2024-08-06T22:18:33.9632440Z + LTC_TS_CUDA=1 2024-08-06T22:18:33.9632772Z + python test/run_test.py --cpp --verbose -i cpp/test_jit cpp/test_lazy 2024-08-06T22:18:34.0596910Z /var/lib/jenkins/workspace/test/run_test.py:21: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-08-06T22:18:34.0597681Z import pkg_resources 2024-08-06T22:18:37.5432050Z Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2024-08-06T22:18:37.5433861Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2024-08-06T22:18:37.5539151Z Found test times from artifacts 2024-08-06T22:18:37.5959964Z Found test times from artifacts 2024-08-06T22:18:37.5973422Z Running 25% of tests based on TD 2024-08-06T22:18:37.5977060Z Running parallel tests on 3 processes 2024-08-06T22:18:37.5977381Z Name: tests to run (est. time: 0.0min) 2024-08-06T22:18:37.5977665Z Serial tests (0): 2024-08-06T22:18:37.5977891Z Parallel tests (1): 2024-08-06T22:18:37.5978112Z cpp/test_jit 1/1 2024-08-06T22:18:37.5978350Z Name: excluded (est. time: 0.0min) 2024-08-06T22:18:37.5978743Z Serial tests (0): 2024-08-06T22:18:37.5978956Z Parallel tests (1): 2024-08-06T22:18:37.5979225Z cpp/test_lazy 1/1 2024-08-06T22:18:37.5979649Z Starting test batch 'tests to run' 0.0 seconds after initiating testing 2024-08-06T22:18:37.6070148Z Running cpp/test_jit 1/1 ... [2024-08-06 22:18:37.606481] 2024-08-06T22:18:37.6075743Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_jit', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-1c20af093c6dce74.xml', '-x', '--reruns=2'] ... [2024-08-06 22:18:37.607113] 2024-08-06T22:18:40.2290203Z 2024-08-06T22:18:40.2291744Z cpp/test_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_jit_1.1_a9c1e86c9f5c6bc3_.log 2024-08-06T22:18:40.2292776Z 2024-08-06T22:18:40.2296626Z Running cpp/test_jit 1/1 ... [2024-08-06 22:18:40.229347] 2024-08-06T22:18:40.2302732Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_jit', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-ccce35c7353b9a78.xml', '-x', '--reruns=2'] ... [2024-08-06 22:18:40.229917] 2024-08-06T22:20:50.3741115Z 2024-08-06T22:20:50.3742362Z cpp/test_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_jit_1.1_ef7dc05941bfe9ce_.log 2024-08-06T22:20:50.3748832Z 2024-08-06T22:20:51.4773444Z + pushd test 2024-08-06T22:20:51.4773709Z ~/workspace/test ~/workspace 2024-08-06T22:20:51.4774014Z + python cpp/jit/tests_setup.py shutdown 2024-08-06T22:20:53.1386581Z + popd 2024-08-06T22:20:53.1386820Z ~/workspace 2024-08-06T22:20:53.1387030Z + assert_git_not_dirty 2024-08-06T22:20:53.1387377Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-06T22:20:53.1387757Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *xla* ]] 2024-08-06T22:20:53.1395030Z ++ git status --porcelain 2024-08-06T22:20:53.1395455Z ++ grep -v '?? third_party' 2024-08-06T22:20:53.3791943Z ++ true 2024-08-06T22:20:53.3794141Z + git_status= 2024-08-06T22:20:53.3794571Z + [[ -n '' ]] 2024-08-06T22:20:53.3794887Z + test_aot_compilation 2024-08-06T22:20:53.3795230Z + echo 'Testing Ahead of Time compilation' 2024-08-06T22:20:53.3795652Z Testing Ahead of Time compilation 2024-08-06T22:20:53.3797045Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10d_cuda_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:20:53.3817489Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda_linalg.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:20:53.3837183Z + '[' -f /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc ']' 2024-08-06T22:20:53.3837826Z + CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:20:53.3838323Z + python test/run_test.py --cpp --verbose -i cpp/test_mobile_nnc 2024-08-06T22:20:53.4804068Z /var/lib/jenkins/workspace/test/run_test.py:21: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-08-06T22:20:53.4804830Z import pkg_resources 2024-08-06T22:20:56.9743951Z Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2024-08-06T22:20:56.9745459Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2024-08-06T22:20:56.9853540Z Found test times from artifacts 2024-08-06T22:20:57.0273723Z Found test times from artifacts 2024-08-06T22:20:57.0286935Z Running 25% of tests based on TD 2024-08-06T22:20:57.0290465Z Running parallel tests on 3 processes 2024-08-06T22:20:57.0290767Z Name: tests to run (est. time: 0.0min) 2024-08-06T22:20:57.0291058Z Serial tests (0): 2024-08-06T22:20:57.0291282Z Parallel tests (1): 2024-08-06T22:20:57.0291511Z cpp/test_mobile_nnc 1/1 2024-08-06T22:20:57.0291769Z Name: excluded (est. time: 0.0min) 2024-08-06T22:20:57.0292192Z Serial tests (0): 2024-08-06T22:20:57.0292619Z Parallel tests (0): 2024-08-06T22:20:57.0292956Z Starting test batch 'tests to run' 0.0 seconds after initiating testing 2024-08-06T22:20:57.0383396Z Running cpp/test_mobile_nnc 1/1 ... [2024-08-06 22:20:57.037835] 2024-08-06T22:20:57.0388398Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-29ee2191a9be390a.xml', '-x', '--reruns=2'] ... [2024-08-06 22:20:57.038375] 2024-08-06T22:20:59.1579856Z 2024-08-06T22:20:59.1581315Z cpp/test_mobile_nnc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_mobile_nnc_1.1_ec2cd48ecc749370_.log 2024-08-06T22:20:59.1582242Z 2024-08-06T22:20:59.5618106Z Running cpp/test_mobile_nnc 1/1 ... [2024-08-06 22:20:59.561153] 2024-08-06T22:20:59.5621049Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-d7cd70354a2e0823.xml', '-x', '--reruns=2'] ... [2024-08-06 22:20:59.561675] 2024-08-06T22:21:03.2342345Z 2024-08-06T22:21:03.2343106Z cpp/test_mobile_nnc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_mobile_nnc_1.1_d632303c92be5540_.log 2024-08-06T22:21:03.2343675Z 2024-08-06T22:21:04.3431289Z + '[' -f /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/aot_model_compiler_test ']' 2024-08-06T22:21:04.3431833Z + source test/mobile/nnc/test_aot_compile.sh 2024-08-06T22:21:04.3432138Z ++ set -e -o pipefail 2024-08-06T22:21:04.3435321Z +++ python -c 'import site; print(site.getsitepackages()[0])' 2024-08-06T22:21:04.3611926Z ++ TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2024-08-06T22:21:04.3612482Z ++ TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-06T22:21:04.3616252Z +++ dirname test/mobile/nnc/test_aot_compile.sh 2024-08-06T22:21:04.3631985Z ++ CURRENT_DIR=test/mobile/nnc 2024-08-06T22:21:04.3632302Z ++ MODEL=aot_test_model.pt 2024-08-06T22:21:04.3632615Z ++ COMPILED_MODEL=aot_test_model.compiled.pt 2024-08-06T22:21:04.3632936Z ++ COMPILED_CODE=aot_test_model.compiled.ll 2024-08-06T22:21:04.3635801Z +++ mktemp -d -t build_XXX 2024-08-06T22:21:04.3652674Z ++ TMP_DIR=/tmp/build_k2k 2024-08-06T22:21:04.3653203Z + test_custom_script_ops 2024-08-06T22:21:04.3653793Z + echo 'Testing custom script operators' 2024-08-06T22:21:04.3654337Z Testing custom script operators 2024-08-06T22:21:04.3654960Z + CUSTOM_OP_BUILD=/var/lib/jenkins/workspace/build/custom_test_artifacts/custom-op-build 2024-08-06T22:21:04.3655584Z + pushd test/custom_operator 2024-08-06T22:21:04.3655860Z ~/workspace/test/custom_operator ~/workspace 2024-08-06T22:21:04.3656329Z + cp -a /var/lib/jenkins/workspace/build/custom_test_artifacts/custom-op-build build 2024-08-06T22:21:04.3830175Z + python test_custom_ops.py -v 2024-08-06T22:21:06.7323957Z /var/lib/jenkins/workspace/test/custom_operator/pointwise.py:11: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-06T22:21:06.7325495Z @torch.library.impl_abstract("custom::cos") 2024-08-06T22:21:06.7510591Z /var/lib/jenkins/workspace/test/custom_operator/pointwise.py:17: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-06T22:21:06.7511678Z @torch.library.impl_abstract("custom::tan") 2024-08-06T22:21:06.7538402Z Test results will be stored in test-reports/python-unittest/test_custom_ops 2024-08-06T22:21:06.7555045Z 2024-08-06T22:21:06.7555386Z Running tests... 2024-08-06T22:21:06.7555763Z ---------------------------------------------------------------------- 2024-08-06T22:21:06.8177613Z test_abstract_impl_pystub_faketensor (__main__.TestCustomOperators) ... /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops.py:9: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-06T22:21:06.8178973Z @torch.library.impl_abstract("custom::nonzero") 2024-08-06T22:21:06.8264595Z /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops.py:13: FutureWarning: `create_unbacked_symint` is deprecated, please use `new_dynamic_size` instead 2024-08-06T22:21:06.8265477Z nnz = ctx.create_unbacked_symint() 2024-08-06T22:21:06.8340366Z ok (0.078s) 2024-08-06T22:21:06.8357330Z test_abstract_impl_pystub_meta (__main__.TestCustomOperators) ... /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops2.py:9: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-06T22:21:06.8358650Z @torch.library.impl_abstract("custom::sin") 2024-08-06T22:21:06.8379796Z ok (0.004s) 2024-08-06T22:21:06.8402486Z test_calling_custom_op (__main__.TestCustomOperators) ... ok (0.002s) 2024-08-06T22:21:06.8874005Z test_calling_custom_op_inside_script_module (__main__.TestCustomOperators) ... ok (0.047s) 2024-08-06T22:21:06.8882263Z test_calling_custom_op_string (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-06T22:21:06.8913914Z test_calling_custom_op_with_autograd (__main__.TestCustomOperators) ... /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:812: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1203.) 2024-08-06T22:21:06.8916308Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2024-08-06T22:21:06.8925455Z ok (0.004s) 2024-08-06T22:21:06.8933520Z test_calling_custom_op_with_autograd_in_nograd_mode (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-06T22:21:06.8937934Z test_custom_library_is_loaded (__main__.TestCustomOperators) ... ok (0.000s) 2024-08-06T22:21:07.0931870Z test_dynamo_pystub_suggestion (__main__.TestCustomOperators) ... ok (0.199s) 2024-08-06T22:21:07.2236332Z test_op_with_incorrect_abstract_impl_pystub (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-06T22:21:07.2244210Z test_op_with_no_abstract_impl_pystub (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-06T22:21:07.2329894Z test_saving_and_loading_script_module_with_custom_op (__main__.TestCustomOperators) ... ok (0.008s) 2024-08-06T22:21:07.2330465Z 2024-08-06T22:21:07.2330662Z ---------------------------------------------------------------------- 2024-08-06T22:21:07.2331177Z Ran 12 tests in 0.477s 2024-08-06T22:21:07.2331320Z 2024-08-06T22:21:07.2331400Z OK 2024-08-06T22:21:07.2331499Z 2024-08-06T22:21:07.2331595Z Generating XML reports... 2024-08-06T22:21:07.2377148Z Generated XML report: test-reports/python-unittest/test_custom_ops/TEST-TestCustomOperators-20240806222106.xml 2024-08-06T22:21:07.8386008Z + python model.py --export-script-module=model.pt 2024-08-06T22:21:09.5083037Z + build/test_custom_ops ./model.pt 2024-08-06T22:21:09.9950886Z [W806 22:21:09.515261057 engine.cpp:1203] Warning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (function operator()) 2024-08-06T22:21:10.3314271Z ok 2024-08-06T22:21:10.5418312Z + popd 2024-08-06T22:21:10.5418641Z ~/workspace 2024-08-06T22:21:10.5418952Z + assert_git_not_dirty 2024-08-06T22:21:10.5419384Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-06T22:21:10.5419928Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *xla* ]] 2024-08-06T22:21:10.5426972Z ++ git status --porcelain 2024-08-06T22:21:10.5427702Z ++ grep -v '?? third_party' 2024-08-06T22:21:10.7841402Z ++ true 2024-08-06T22:21:10.7843032Z + git_status= 2024-08-06T22:21:10.7843322Z + [[ -n '' ]] 2024-08-06T22:21:10.7843613Z + test_custom_backend 2024-08-06T22:21:10.7843943Z + echo 'Testing custom backends' 2024-08-06T22:21:10.7844294Z Testing custom backends 2024-08-06T22:21:10.7844907Z + CUSTOM_BACKEND_BUILD=/var/lib/jenkins/workspace/build/custom_test_artifacts/custom-backend-build 2024-08-06T22:21:10.7845427Z + pushd test/custom_backend 2024-08-06T22:21:10.7845739Z ~/workspace/test/custom_backend ~/workspace 2024-08-06T22:21:10.7846251Z + cp -a /var/lib/jenkins/workspace/build/custom_test_artifacts/custom-backend-build build 2024-08-06T22:21:10.8006979Z + python test_custom_backend.py -v 2024-08-06T22:21:13.1612568Z Test results will be stored in test-reports/python-unittest/test_custom_backend 2024-08-06T22:21:13.1623851Z 2024-08-06T22:21:13.1624397Z Running tests... 2024-08-06T22:21:13.1624798Z ---------------------------------------------------------------------- 2024-08-06T22:21:13.1628996Z test_execute (__main__.TestCustomBackend) 2024-08-06T22:21:13.2179172Z Test execution using the custom backend. ... ok (0.055s) 2024-08-06T22:21:13.2183750Z test_save_load (__main__.TestCustomBackend) 2024-08-06T22:21:13.2380600Z Test that a lowered module can be executed correctly ... ok (0.020s) 2024-08-06T22:21:13.2381009Z 2024-08-06T22:21:13.2381218Z ---------------------------------------------------------------------- 2024-08-06T22:21:13.2381661Z Ran 2 tests in 0.076s 2024-08-06T22:21:13.2381831Z 2024-08-06T22:21:13.2381911Z OK 2024-08-06T22:21:13.2382012Z 2024-08-06T22:21:13.2382120Z Generating XML reports... 2024-08-06T22:21:13.2407965Z Generated XML report: test-reports/python-unittest/test_custom_backend/TEST-TestCustomBackend-20240806222113.xml 2024-08-06T22:21:13.7334645Z + python backend.py --export-module-to=model.pt 2024-08-06T22:21:15.4435510Z + build/test_custom_backend ./model.pt 2024-08-06T22:21:15.9300285Z Testing custom_backend 2024-08-06T22:21:15.9985674Z OK 2024-08-06T22:21:16.1162839Z + rm -f ./model.pt 2024-08-06T22:21:16.1178373Z + popd 2024-08-06T22:21:16.1178943Z ~/workspace 2024-08-06T22:21:16.1179237Z + assert_git_not_dirty 2024-08-06T22:21:16.1179642Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-06T22:21:16.1180127Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *xla* ]] 2024-08-06T22:21:16.1186471Z ++ git status --porcelain 2024-08-06T22:21:16.1186853Z ++ grep -v '?? third_party' 2024-08-06T22:21:16.3592907Z ++ true 2024-08-06T22:21:16.3594944Z + git_status= 2024-08-06T22:21:16.3595641Z + [[ -n '' ]] 2024-08-06T22:21:16.3596144Z + test_torch_function_benchmark 2024-08-06T22:21:16.3596572Z + echo 'Testing __torch_function__ benchmarks' 2024-08-06T22:21:16.3597264Z Testing __torch_function__ benchmarks 2024-08-06T22:21:16.3597622Z + pushd benchmarks/overrides_benchmark 2024-08-06T22:21:16.3598073Z ~/workspace/benchmarks/overrides_benchmark ~/workspace 2024-08-06T22:21:16.3598434Z + python bench.py -n 1 -m 2 2024-08-06T22:21:17.7242350Z Type tensor had a minimum time of 0.005245208740234375 us and a standard deviation of 0.3536963486112654 us. 2024-08-06T22:21:17.7243207Z Type SubTensor had a minimum time of 0.012636184692382812 us and a standard deviation of 0.03692063910420984 us. 2024-08-06T22:21:17.7243971Z Type WithTorchFunction had a minimum time of 0.0059604644775390625 us and a standard deviation of 0.01787026303645689 us. 2024-08-06T22:21:17.7244780Z Type SubWithTorchFunction had a minimum time of 0.010728836059570312 us and a standard deviation of 0.0065749081841204315 us. 2024-08-06T22:21:18.0057308Z + python pyspybench.py Tensor -n 1 2024-08-06T22:21:19.6560752Z + python pyspybench.py SubTensor -n 1 2024-08-06T22:21:21.3022039Z + python pyspybench.py WithTorchFunction -n 1 2024-08-06T22:21:22.9403113Z + python pyspybench.py SubWithTorchFunction -n 1 2024-08-06T22:21:24.5763898Z + popd 2024-08-06T22:21:24.5764123Z ~/workspace 2024-08-06T22:21:24.5764334Z + assert_git_not_dirty 2024-08-06T22:21:24.5764644Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-06T22:21:24.5765039Z + [[ linux-focal-cuda12.1-py3.10-gcc9-sm86 != *xla* ]] 2024-08-06T22:21:24.5772323Z ++ git status --porcelain 2024-08-06T22:21:24.5772665Z ++ grep -v '?? third_party' 2024-08-06T22:21:24.8177038Z ++ true 2024-08-06T22:21:24.8179727Z + git_status= 2024-08-06T22:21:24.8180160Z + [[ -n '' ]] 2024-08-06T22:21:24.8181013Z + cleanup_workspace 2024-08-06T22:21:24.8181649Z + echo 'sudo may print the following warning message that can be ignored. The chown command will still run.' 2024-08-06T22:21:24.8182495Z sudo may print the following warning message that can be ignored. The chown command will still run. 2024-08-06T22:21:24.8183070Z + echo ' sudo: setrlimit(RLIMIT_STACK): Operation not permitted' 2024-08-06T22:21:24.8183478Z sudo: setrlimit(RLIMIT_STACK): Operation not permitted 2024-08-06T22:21:24.8183966Z + echo 'For more details refer to https://github.com/sudo-project/sudo/issues/42' 2024-08-06T22:21:24.8184501Z For more details refer to https://github.com/sudo-project/sudo/issues/42 2024-08-06T22:21:24.8184919Z + sudo chown -R 1000 /var/lib/jenkins/workspace 2024-08-06T22:21:25.5580016Z ##[group]Run cat test/**/*_toprint.log || true 2024-08-06T22:21:25.5580383Z cat test/**/*_toprint.log || true 2024-08-06T22:21:25.5594778Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:25.5595236Z env: 2024-08-06T22:21:25.5595498Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:25.5595859Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:25.5596357Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:25.5596802Z ##[endgroup] 2024-08-06T22:21:25.5689430Z cat: 'test/**/*_toprint.log': No such file or directory 2024-08-06T22:21:25.5724700Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2024-08-06T22:21:25.5725021Z kill "$MONITOR_SCRIPT_PID" 2024-08-06T22:21:25.5732884Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:25.5733219Z env: 2024-08-06T22:21:25.5733414Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:25.5733714Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:25.5734343Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:25.5734798Z MONITOR_SCRIPT_PID: 557221 2024-08-06T22:21:25.5735042Z ##[endgroup] 2024-08-06T22:21:25.5865930Z Prepare all required actions 2024-08-06T22:21:25.5866293Z Getting action download info 2024-08-06T22:21:25.7973391Z Download action repository 'actions/upload-artifact@v3' (SHA:a8a3f3ad30e3422c9c7b888a15615d19a852ae32) 2024-08-06T22:21:26.0066678Z ##[group]Run ./.github/actions/upload-test-artifacts 2024-08-06T22:21:26.0067103Z with: 2024-08-06T22:21:26.0067455Z file-suffix: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080 2024-08-06T22:21:26.0067900Z s3-bucket: gha-artifacts 2024-08-06T22:21:26.0068137Z env: 2024-08-06T22:21:26.0068323Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.0068625Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.0069123Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.0069555Z ##[endgroup] 2024-08-06T22:21:26.0121125Z ##[group]Run # Remove any previous test jsons if they exist 2024-08-06T22:21:26.0121573Z # Remove any previous test jsons if they exist 2024-08-06T22:21:26.0121901Z rm -f test-jsons-*.zip 2024-08-06T22:21:26.0122241Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2024-08-06T22:21:26.0131065Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:26.0131397Z env: 2024-08-06T22:21:26.0131599Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.0131899Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.0132386Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.0132983Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080 2024-08-06T22:21:26.0133401Z ##[endgroup] 2024-08-06T22:21:26.0429349Z adding: test/allowlist_for_publicAPI.json (deflated 79%) 2024-08-06T22:21:26.0458994Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2024-08-06T22:21:26.0459479Z adding: test/minioptest_failures_dict.json (deflated 70%) 2024-08-06T22:21:26.0465890Z adding: test/profiler/profiler_utils_mock_events.json (deflated 87%) 2024-08-06T22:21:26.0469019Z adding: test/test-reports/td_exclusions-c532b7f50607ee95edb9.json (deflated 81%) 2024-08-06T22:21:26.0469709Z adding: test/test-reports/td_exclusions-61bf590f88f93fc2b8f7.json (deflated 38%) 2024-08-06T22:21:26.0470361Z adding: test/test-reports/td_exclusions-69a370009091b4819966.json (deflated 18%) 2024-08-06T22:21:26.0470853Z adding: test/.pytorch-slow-tests.json (deflated 66%) 2024-08-06T22:21:26.0483518Z adding: test/.pytorch-disabled-tests.json (deflated 89%) 2024-08-06T22:21:26.0521366Z ##[group]Run # Remove any previous test reports if they exist 2024-08-06T22:21:26.0521799Z # Remove any previous test reports if they exist 2024-08-06T22:21:26.0522141Z rm -f test-reports-*.zip 2024-08-06T22:21:26.0522533Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' -i '*.csv' 2024-08-06T22:21:26.0531162Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:26.0531515Z env: 2024-08-06T22:21:26.0531728Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.0532036Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.0532543Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.0533137Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080 2024-08-06T22:21:26.0533576Z ##[endgroup] 2024-08-06T22:21:26.0814990Z adding: test/custom_backend/test-reports/python-unittest/test_custom_backend/TEST-TestCustomBackend-20240806222113.xml (deflated 56%) 2024-08-06T22:21:26.0816022Z adding: test/custom_operator/test-reports/python-unittest/test_custom_ops/TEST-TestCustomOperators-20240806222106.xml (deflated 73%) 2024-08-06T22:21:26.0816946Z adding: test/test-reports/python-pytest/test_nestedtensor/test_nestedtensor-a15ab28f2c5cf2fd.xml (deflated 28%) 2024-08-06T22:21:26.0843259Z adding: test/test-reports/python-pytest/test_nestedtensor/test_nestedtensor-996933be75a161bc.xml (deflated 95%) 2024-08-06T22:21:26.0844047Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-dde7f24a1e4b8ddd.xml (deflated 28%) 2024-08-06T22:21:26.0844773Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-6833244398b9a790.xml (deflated 28%) 2024-08-06T22:21:26.0845483Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-f81a19ec9047446a.xml (deflated 28%) 2024-08-06T22:21:26.0846278Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-61c5f75e9793c036.xml (deflated 28%) 2024-08-06T22:21:26.0846994Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-fb55b4ee81767bc4.xml (deflated 28%) 2024-08-06T22:21:26.0847709Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-a73fe7a9adcb9059.xml (deflated 28%) 2024-08-06T22:21:26.0852299Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-c1b5a1cdbe2e4333.xml (deflated 91%) 2024-08-06T22:21:26.0859238Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-55034f69ab832630.xml (deflated 91%) 2024-08-06T22:21:26.0864734Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-d6eef79abdfcaa00.xml (deflated 91%) 2024-08-06T22:21:26.0871824Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-bae5fb9212369f90.xml (deflated 91%) 2024-08-06T22:21:26.0877998Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-85f72d5597580089.xml (deflated 91%) 2024-08-06T22:21:26.0884832Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-ec8b9fcb4e21174c.xml (deflated 91%) 2024-08-06T22:21:26.0885747Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-efbd75590a4bc435.xml (deflated 28%) 2024-08-06T22:21:26.0889585Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-fdd8fa94916990a4.xml (deflated 93%) 2024-08-06T22:21:26.0890558Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-aea5e16907f57e75.xml (deflated 28%) 2024-08-06T22:21:26.0891388Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-5ab53533948bf7e1.xml (deflated 28%) 2024-08-06T22:21:26.0892453Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-31c8cb430a31bd8b.xml (deflated 28%) 2024-08-06T22:21:26.0922435Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-b162a8e48ed5f4af.xml (deflated 93%) 2024-08-06T22:21:26.0954050Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-a55bb1b81c4c4d1c.xml (deflated 93%) 2024-08-06T22:21:26.0985754Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-fd529c0511a5c49d.xml (deflated 93%) 2024-08-06T22:21:26.0986897Z adding: test/test-reports/python-pytest/test_ops/test_ops-188ad6b879c1025a.xml (deflated 28%) 2024-08-06T22:21:26.0987575Z adding: test/test-reports/python-pytest/test_ops/test_ops-2780d0f310bde00d.xml (deflated 28%) 2024-08-06T22:21:26.0988257Z adding: test/test-reports/python-pytest/test_ops/test_ops-1fe383e88f7bddac.xml (deflated 28%) 2024-08-06T22:21:26.1110636Z adding: test/test-reports/python-pytest/test_ops/test_ops-2dcab3e055573bc4.xml (deflated 97%) 2024-08-06T22:21:26.1231607Z adding: test/test-reports/python-pytest/test_ops/test_ops-ebef25ec10a6c3e9.xml (deflated 97%) 2024-08-06T22:21:26.1346993Z adding: test/test-reports/python-pytest/test_ops/test_ops-a77be1e37f06d4dd.xml (deflated 97%) 2024-08-06T22:21:26.1348000Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-cadc6e892494bd6e.xml (deflated 28%) 2024-08-06T22:21:26.1349173Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-1a75707580a23167.xml (deflated 28%) 2024-08-06T22:21:26.1350335Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-1543d942fcb016b6.xml (deflated 28%) 2024-08-06T22:21:26.1351710Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-e514922bde3007a9.xml (deflated 89%) 2024-08-06T22:21:26.1352885Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-35cbc1aa0a45a81b.xml (deflated 91%) 2024-08-06T22:21:26.1355454Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-2b88ebc57e92266a.xml (deflated 88%) 2024-08-06T22:21:26.1358035Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-a62140c89544360f.xml (deflated 89%) 2024-08-06T22:21:26.1358993Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-cc0f66fcca588979.xml (deflated 36%) 2024-08-06T22:21:26.1359974Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-7ce886a009114a1b.xml (deflated 28%) 2024-08-06T22:21:26.1360941Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-d858e15221cdc00b.xml (deflated 91%) 2024-08-06T22:21:26.1361888Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-a82215d487f8633b.xml (deflated 45%) 2024-08-06T22:21:26.1362906Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-24481963e654c7d1.xml (deflated 28%) 2024-08-06T22:21:26.1363983Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-4ef757c810b197e4.xml (deflated 28%) 2024-08-06T22:21:26.1365168Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-d351918228d00cbd.xml (deflated 60%) 2024-08-06T22:21:26.1366355Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-a468123d14292c06.xml (deflated 28%) 2024-08-06T22:21:26.1373410Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-f30e8cdd1e46e2c9.xml (deflated 92%) 2024-08-06T22:21:26.1382826Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-58b8cb49731929ba.xml (deflated 93%) 2024-08-06T22:21:26.1390950Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-d10693f9d992a8b3.xml (deflated 93%) 2024-08-06T22:21:26.1392444Z adding: test/test-reports/python-pytest/inductor.test_cuda_cpp_wrapper/inductor.test_cuda_cpp_wrapper-580de07509b6e681.xml (deflated 28%) 2024-08-06T22:21:26.1393474Z adding: test/test-reports/python-pytest/inductor.test_cuda_cpp_wrapper/inductor.test_cuda_cpp_wrapper-45e161f052e17df8.xml (deflated 83%) 2024-08-06T22:21:26.1394492Z adding: test/test-reports/python-pytest/inductor.test_inductor_freezing/inductor.test_inductor_freezing-903c5e02fae95ed8.xml (deflated 28%) 2024-08-06T22:21:26.1396854Z adding: test/test-reports/python-pytest/inductor.test_inductor_freezing/inductor.test_inductor_freezing-680850c36e262ccd.xml (deflated 90%) 2024-08-06T22:21:26.1397820Z adding: test/test-reports/python-pytest/inductor.test_fx_fusion/inductor.test_fx_fusion-fcba5b1402458a64.xml (deflated 28%) 2024-08-06T22:21:26.1398714Z adding: test/test-reports/python-pytest/inductor.test_fx_fusion/inductor.test_fx_fusion-9b1a68c415365ae9.xml (deflated 63%) 2024-08-06T22:21:26.1399582Z adding: test/test-reports/python-pytest/torch_np.test_dtype/torch_np.test_dtype-2a169b6fd93e3800.xml (deflated 28%) 2024-08-06T22:21:26.1400414Z adding: test/test-reports/python-pytest/torch_np.test_dtype/torch_np.test_dtype-488c3e8a6e128543.xml (deflated 94%) 2024-08-06T22:21:26.1401180Z adding: test/test-reports/python-pytest/test_deploy/test_deploy-30aa6b8519d005c2.xml (deflated 28%) 2024-08-06T22:21:26.1402053Z adding: test/test-reports/python-pytest/test_deploy/test_deploy-17b5a2fc1ba1ac71.xml (deflated 37%) 2024-08-06T22:21:26.1402880Z adding: test/test-reports/python-pytest/profiler.test_cpp_thread/profiler.test_cpp_thread-649d6ec4640a3614.xml (deflated 28%) 2024-08-06T22:21:26.1403791Z adding: test/test-reports/python-pytest/profiler.test_cpp_thread/profiler.test_cpp_thread-7d725202cc2822e0.xml (deflated 83%) 2024-08-06T22:21:26.1404765Z adding: test/test-reports/python-pytest/dynamo.test_exceptions/dynamo.test_exceptions-c93bbf5a040ecdfc.xml (deflated 28%) 2024-08-06T22:21:26.1405658Z adding: test/test-reports/python-pytest/dynamo.test_exceptions/dynamo.test_exceptions-e3a001a94c803402.xml (deflated 80%) 2024-08-06T22:21:26.1406527Z adding: test/test-reports/python-pytest/test_fx_experimental/test_fx_experimental-99c8450d462d4d27.xml (deflated 28%) 2024-08-06T22:21:26.1419517Z adding: test/test-reports/python-pytest/test_fx_experimental/test_fx_experimental-673c081bacecf650.xml (deflated 97%) 2024-08-06T22:21:26.1420331Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-1c20af093c6dce74.xml (deflated 29%) 2024-08-06T22:21:26.1428492Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-ccce35c7353b9a78.xml (deflated 86%) 2024-08-06T22:21:26.1429324Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-29ee2191a9be390a.xml (deflated 29%) 2024-08-06T22:21:26.1430051Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-d7cd70354a2e0823.xml (deflated 58%) 2024-08-06T22:21:26.1467738Z ##[group]Run # Remove any previous usage logs if they exist 2024-08-06T22:21:26.1468163Z # Remove any previous usage logs if they exist 2024-08-06T22:21:26.1468494Z rm -f logs-*.zip 2024-08-06T22:21:26.1468900Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2024-08-06T22:21:26.1469379Z # so check to see if the file exists first 2024-08-06T22:21:26.1469696Z if [ -f 'usage_log.txt' ]; then 2024-08-06T22:21:26.1470037Z  zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' 2024-08-06T22:21:26.1470345Z fi 2024-08-06T22:21:26.1470579Z if ls test/**/*.log 1> /dev/null 2>&1; then 2024-08-06T22:21:26.1470937Z  zip -r "logs-${FILE_SUFFIX}.zip" test -i '*.log' 2024-08-06T22:21:26.1471254Z fi 2024-08-06T22:21:26.1479645Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:26.1479983Z env: 2024-08-06T22:21:26.1480184Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.1480490Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.1480978Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.1481577Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080 2024-08-06T22:21:26.1482001Z ##[endgroup] 2024-08-06T22:21:26.1576004Z adding: usage_log.txt (deflated 92%) 2024-08-06T22:21:26.1876130Z adding: test/test-reports/test_nestedtensor_1.1_c7c818116737d768_.log (deflated 50%) 2024-08-06T22:21:26.1877272Z adding: test/test-reports/test_decomp_4.22_1c993cdc6879f82f_.log (deflated 48%) 2024-08-06T22:21:26.1878284Z adding: test/test-reports/test_decomp_6.22_723b1d41e7b06743_.log (deflated 48%) 2024-08-06T22:21:26.1879274Z adding: test/test-reports/test_decomp_7.22_0225f82664358233_.log (deflated 48%) 2024-08-06T22:21:26.1880288Z adding: test/test-reports/test_decomp_17.22_04aaac7bcf595a12_.log (deflated 48%) 2024-08-06T22:21:26.1881316Z adding: test/test-reports/test_decomp_21.22_a8b38bfe130b23b0_.log (deflated 48%) 2024-08-06T22:21:26.1882312Z adding: test/test-reports/test_decomp_22.22_5f62108bfb875684_.log (deflated 48%) 2024-08-06T22:21:26.1883487Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.16_bc0f63e78032244d_.log (deflated 52%) 2024-08-06T22:21:26.1884713Z adding: test/test-reports/functorch.test_ops_3.7_3bf65b2a552bdb0d_.log (deflated 50%) 2024-08-06T22:21:26.1885818Z adding: test/test-reports/functorch.test_ops_4.7_f1708022e7ee2246_.log (deflated 49%) 2024-08-06T22:21:26.1887053Z adding: test/test-reports/functorch.test_ops_7.7_6e93807924724224_.log (deflated 50%) 2024-08-06T22:21:26.1887621Z adding: test/test-reports/test_ops_2.8_44a8717aa20bbc2b_.log (deflated 49%) 2024-08-06T22:21:26.1888130Z adding: test/test-reports/test_ops_4.8_f4354431713e9870_.log (deflated 49%) 2024-08-06T22:21:26.1888718Z adding: test/test-reports/test_ops_6.8_92b29bad7748e831_.log (deflated 49%) 2024-08-06T22:21:26.1889307Z adding: test/test-reports/inductor.test_aot_inductor_7.16_eb833b86d3a09c09_.log (deflated 51%) 2024-08-06T22:21:26.1889962Z adding: test/test-reports/inductor.test_aot_inductor_9.16_7c48c5cae5e1f9ce_.log (deflated 51%) 2024-08-06T22:21:26.1890626Z adding: test/test-reports/inductor.test_aot_inductor_11.16_baacaa68a282cd62_.log (deflated 51%) 2024-08-06T22:21:26.1891350Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_2.6_e796776bea6851ef_.log (deflated 53%) 2024-08-06T22:21:26.1892402Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_3.6_1cad0782905fa8f3_.log (deflated 59%) 2024-08-06T22:21:26.1893244Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_4.6_944c9e056a284ff9_.log (deflated 53%) 2024-08-06T22:21:26.1894257Z adding: test/test-reports/inductor.test_cuda_cpp_wrapper_4.5_d155ac4deb1da02c_.log (deflated 51%) 2024-08-06T22:21:26.1894964Z adding: test/test-reports/inductor.test_inductor_freezing_1.1_f328c1a4beea5017_.log (deflated 52%) 2024-08-06T22:21:26.1895617Z adding: test/test-reports/inductor.test_fx_fusion_1.1_658a5163b038c898_.log (deflated 50%) 2024-08-06T22:21:26.1896225Z adding: test/test-reports/torch_np.test_dtype_1.1_15859ede88a296a5_.log (deflated 50%) 2024-08-06T22:21:26.1896809Z adding: test/test-reports/test_deploy_1.1_eb83ca26a22c584b_.log (deflated 49%) 2024-08-06T22:21:26.1897409Z adding: test/test-reports/profiler.test_cpp_thread_1.1_2e29db9a2540ef70_.log (deflated 62%) 2024-08-06T22:21:26.1898092Z adding: test/test-reports/dynamo.test_exceptions_1.1_9ec46a182d72911c_.log (deflated 50%) 2024-08-06T22:21:26.1898760Z adding: test/test-reports/test_fx_experimental_1.1_c0ea3f9d3b2f79ce_.log (deflated 50%) 2024-08-06T22:21:26.1929694Z adding: test/test-reports/test_nestedtensor_1.1_dc5f1dd81e9c07b0_.log (deflated 95%) 2024-08-06T22:21:26.1942124Z adding: test/test-reports/test_decomp_6.22_7c53a7588da0f0bd_.log (deflated 89%) 2024-08-06T22:21:26.1953360Z adding: test/test-reports/test_decomp_4.22_0cab0670d711450b_.log (deflated 89%) 2024-08-06T22:21:26.1962260Z adding: test/test-reports/test_decomp_7.22_2d1d0e835d58ce77_.log (deflated 89%) 2024-08-06T22:21:26.1973733Z adding: test/test-reports/test_decomp_21.22_52c3d44543a298c3_.log (deflated 89%) 2024-08-06T22:21:26.1984110Z adding: test/test-reports/test_decomp_17.22_7d7f397af1933552_.log (deflated 89%) 2024-08-06T22:21:26.1991263Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.16_48a5e37732f88614_.log (deflated 91%) 2024-08-06T22:21:26.2002667Z adding: test/test-reports/test_decomp_22.22_018afb577caad2e6_.log (deflated 89%) 2024-08-06T22:21:26.2040668Z adding: test/test-reports/functorch.test_ops_3.7_a7de25d0d05076bd_.log (deflated 92%) 2024-08-06T22:21:26.2078970Z adding: test/test-reports/functorch.test_ops_4.7_68743ccfbd995ca9_.log (deflated 92%) 2024-08-06T22:21:26.2179002Z adding: test/test-reports/test_ops_2.8_f0a7bc2de991ae35_.log (deflated 92%) 2024-08-06T22:21:26.2215190Z adding: test/test-reports/functorch.test_ops_7.7_30bacd547c101322_.log (deflated 92%) 2024-08-06T22:21:26.2315216Z adding: test/test-reports/test_ops_4.8_6e12e65be181d4cf_.log (deflated 92%) 2024-08-06T22:21:26.2415084Z adding: test/test-reports/test_ops_6.8_a9d9dc6c489cdbf0_.log (deflated 92%) 2024-08-06T22:21:26.2418044Z adding: test/test-reports/inductor.test_aot_inductor_11.16_e9a072ac5423a2ba_.log (deflated 88%) 2024-08-06T22:21:26.2430765Z adding: test/test-reports/inductor.test_aot_inductor_7.16_e50ead3b82712b34_.log (deflated 92%) 2024-08-06T22:21:26.2441393Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_2.6_bb4b20634ee035ca_.log (deflated 90%) 2024-08-06T22:21:26.2446912Z adding: test/test-reports/inductor.test_aot_inductor_9.16_37c7619e1459b5df_.log (deflated 91%) 2024-08-06T22:21:26.2455420Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_3.6_04d78c180807a766_.log (deflated 90%) 2024-08-06T22:21:26.2463246Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_4.6_fe53037062bc5225_.log (deflated 91%) 2024-08-06T22:21:26.2463957Z adding: test/test-reports/inductor.test_fx_fusion_1.1_dbaca286ee0bc292_.log (deflated 63%) 2024-08-06T22:21:26.2464909Z adding: test/test-reports/torch_np.test_dtype_1.1_e4079a06cfe371cf_.log (deflated 88%) 2024-08-06T22:21:26.2465484Z adding: test/test-reports/test_deploy_1.1_3d3204ec08a4930d_.log (deflated 50%) 2024-08-06T22:21:26.2466793Z adding: test/test-reports/profiler.test_cpp_thread_1.1_e137648a79de89f6_.log (deflated 82%) 2024-08-06T22:21:26.2468713Z adding: test/test-reports/inductor.test_inductor_freezing_1.1_af5283ef1d76b32e_.log (deflated 86%) 2024-08-06T22:21:26.2469378Z adding: test/test-reports/dynamo.test_exceptions_1.1_ff88573c60c242ad_.log (deflated 76%) 2024-08-06T22:21:26.2486539Z adding: test/test-reports/test_fx_experimental_1.1_38d30b9876216252_.log (deflated 95%) 2024-08-06T22:21:26.2487636Z adding: test/test-reports/inductor.test_cuda_cpp_wrapper_4.5_f865a2da5bca2dfd_.log (deflated 82%) 2024-08-06T22:21:26.2488254Z adding: test/test-reports/cpp.test_jit_1.1_a9c1e86c9f5c6bc3_.log (deflated 48%) 2024-08-06T22:21:26.2508950Z adding: test/test-reports/cpp.test_jit_1.1_ef7dc05941bfe9ce_.log (deflated 93%) 2024-08-06T22:21:26.2509528Z adding: test/test-reports/cpp.test_mobile_nnc_1.1_ec2cd48ecc749370_.log (deflated 48%) 2024-08-06T22:21:26.2510117Z adding: test/test-reports/cpp.test_mobile_nnc_1.1_d632303c92be5540_.log (deflated 75%) 2024-08-06T22:21:26.2549215Z ##[group]Run # Remove any previous debugging artifacts if they exist 2024-08-06T22:21:26.2549753Z # Remove any previous debugging artifacts if they exist 2024-08-06T22:21:26.2550106Z rm -f debug-*.zip 2024-08-06T22:21:26.2550371Z if [ -d 'test/debug' ]; then 2024-08-06T22:21:26.2550687Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2024-08-06T22:21:26.2550995Z fi 2024-08-06T22:21:26.2559332Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:26.2559667Z env: 2024-08-06T22:21:26.2559849Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.2560148Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.2560637Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.2561224Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080 2024-08-06T22:21:26.2561638Z ##[endgroup] 2024-08-06T22:21:26.2649461Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-06T22:21:26.2649758Z with: 2024-08-06T22:21:26.2649965Z s3-bucket: gha-artifacts 2024-08-06T22:21:26.2650251Z s3-prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:26.2650572Z retention-days: 14 2024-08-06T22:21:26.2650798Z if-no-files-found: warn 2024-08-06T22:21:26.2651045Z path: test-jsons-*.zip 2024-08-06T22:21:26.2651274Z name: artifact 2024-08-06T22:21:26.2651480Z region: us-east-1 2024-08-06T22:21:26.2651684Z env: 2024-08-06T22:21:26.2651879Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.2652172Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.2652665Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.2653107Z ##[endgroup] 2024-08-06T22:21:26.6389717Z NOTE: s3-prefix specified, ignoring name parameter 2024-08-06T22:21:26.6390278Z With the provided path, there will be 1 file uploaded 2024-08-06T22:21:26.6390701Z Uploading to s3 prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:26.6448165Z Starting upload of test-jsons-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080.zip 2024-08-06T22:21:26.7752155Z Finished upload of test-jsons-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080.zip 2024-08-06T22:21:26.8032662Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-06T22:21:26.8032967Z with: 2024-08-06T22:21:26.8033160Z s3-bucket: gha-artifacts 2024-08-06T22:21:26.8033560Z s3-prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:26.8033874Z retention-days: 14 2024-08-06T22:21:26.8034169Z if-no-files-found: error 2024-08-06T22:21:26.8034423Z path: test-reports-*.zip 2024-08-06T22:21:26.8034661Z name: artifact 2024-08-06T22:21:26.8034862Z region: us-east-1 2024-08-06T22:21:26.8035070Z env: 2024-08-06T22:21:26.8035267Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:26.8035562Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:26.8050749Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:26.8051195Z ##[endgroup] 2024-08-06T22:21:27.1553747Z NOTE: s3-prefix specified, ignoring name parameter 2024-08-06T22:21:27.1554162Z With the provided path, there will be 1 file uploaded 2024-08-06T22:21:27.1554747Z Uploading to s3 prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:27.1610680Z Starting upload of test-reports-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080.zip 2024-08-06T22:21:27.4022271Z Finished upload of test-reports-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080.zip 2024-08-06T22:21:27.4303105Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-06T22:21:27.4303411Z with: 2024-08-06T22:21:27.4303617Z s3-bucket: gha-artifacts 2024-08-06T22:21:27.4303900Z s3-prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:27.4304214Z retention-days: 14 2024-08-06T22:21:27.4304455Z if-no-files-found: ignore 2024-08-06T22:21:27.4304701Z path: logs-*.zip 2024-08-06T22:21:27.4304914Z name: artifact 2024-08-06T22:21:27.4305118Z region: us-east-1 2024-08-06T22:21:27.4305340Z env: 2024-08-06T22:21:27.4305538Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:27.4305836Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:27.4306340Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:27.4306786Z ##[endgroup] 2024-08-06T22:21:27.7852748Z NOTE: s3-prefix specified, ignoring name parameter 2024-08-06T22:21:27.7853524Z With the provided path, there will be 1 file uploaded 2024-08-06T22:21:27.7854480Z Uploading to s3 prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:27.7909495Z Starting upload of logs-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080.zip 2024-08-06T22:21:27.9384820Z Finished upload of logs-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28428649080.zip 2024-08-06T22:21:27.9663523Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-06T22:21:27.9663818Z with: 2024-08-06T22:21:27.9664016Z s3-bucket: gha-artifacts 2024-08-06T22:21:27.9664318Z s3-prefix: pytorch/pytorch/10273124344/1/artifact 2024-08-06T22:21:27.9664623Z retention-days: 14 2024-08-06T22:21:27.9664855Z if-no-files-found: ignore 2024-08-06T22:21:27.9665108Z path: debug-*.zip 2024-08-06T22:21:27.9665311Z name: artifact 2024-08-06T22:21:27.9665518Z region: us-east-1 2024-08-06T22:21:27.9665719Z env: 2024-08-06T22:21:27.9665909Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:27.9666210Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:27.9666700Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:27.9667126Z ##[endgroup] 2024-08-06T22:21:28.3110133Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2024-08-06T22:21:28.3389186Z ##[group]Run # shellcheck disable=SC2156 2024-08-06T22:21:28.3389520Z # shellcheck disable=SC2156 2024-08-06T22:21:28.3390034Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2024-08-06T22:21:28.3399099Z shell: /usr/bin/bash -e {0} 2024-08-06T22:21:28.3399340Z env: 2024-08-06T22:21:28.3399549Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:28.3399850Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:28.3400337Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:28.3400877Z ##[endgroup] 2024-08-06T22:21:28.6018241Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2024-08-06T22:21:28.6018626Z with: 2024-08-06T22:21:28.6018795Z env: 2024-08-06T22:21:28.6018980Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:28.6019278Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:28.6019760Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:28.6020196Z ##[endgroup] 2024-08-06T22:21:28.6043431Z ##[group]Run set -eou pipefail 2024-08-06T22:21:28.6043722Z set -eou pipefail 2024-08-06T22:21:28.6043949Z  2024-08-06T22:21:28.6044272Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2024-08-06T22:21:28.6044782Z for _ in $(seq 1440); do 2024-08-06T22:21:28.6045070Z  # Break if no ssh session exists anymore 2024-08-06T22:21:28.6045379Z  if [ "$(who)" = "" ]; then 2024-08-06T22:21:28.6045653Z  break 2024-08-06T22:21:28.6045875Z  fi 2024-08-06T22:21:28.6046066Z  echo "." 2024-08-06T22:21:28.6046277Z  sleep 5 2024-08-06T22:21:28.6046482Z done 2024-08-06T22:21:28.6055069Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:28.6055401Z env: 2024-08-06T22:21:28.6055598Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:28.6055883Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:28.6056377Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:28.6056811Z ##[endgroup] 2024-08-06T22:21:28.6084811Z Holding runner for 2 hours until all ssh sessions have logged out 2024-08-06T22:21:28.6147726Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2024-08-06T22:21:28.6148227Z # ignore expansion of "docker ps -q" since it could be empty 2024-08-06T22:21:28.6148596Z # shellcheck disable=SC2046 2024-08-06T22:21:28.6148907Z docker stop $(docker ps -q) || true 2024-08-06T22:21:28.6149212Z # Prune all of the docker images 2024-08-06T22:21:28.6149493Z docker system prune -af 2024-08-06T22:21:28.6157974Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:28.6158305Z env: 2024-08-06T22:21:28.6158497Z GIT_DEFAULT_BRANCH: main 2024-08-06T22:21:28.6158795Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-06T22:21:28.6159283Z DOCKER_CONTAINER_ID: 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:28.6159717Z ##[endgroup] 2024-08-06T22:21:29.2385105Z 88943b22963e 2024-08-06T22:21:31.1594281Z Deleted Containers: 2024-08-06T22:21:31.1594699Z 88943b22963ea10ef0311f3f36ed71376832e46cba7c57d5d6ba0df4e2579203 2024-08-06T22:21:31.1595079Z 2024-08-06T22:21:39.6069550Z Deleted Images: 2024-08-06T22:21:39.6070350Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-06T22:21:39.6071605Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9@sha256:00f47b036f588ca5ef8866f8635fabba5a95cdf9ff1adae7d2a674ef1d4076e9 2024-08-06T22:21:39.6072578Z deleted: sha256:6ec36276acd88c9be8b44d856744037d399b35f4bb1703e637c27ae2b254c901 2024-08-06T22:21:39.6073165Z deleted: sha256:6fbc5fe2ebb0dc33846ab2ade7c5296a0a521e16f71c3b15b8a0c40a8fce5ed3 2024-08-06T22:21:39.6073753Z deleted: sha256:31fb5fadd0be2cc6e0d198fbc00e4f3c25925bc9c9ce79be23f02aeb56f6a55a 2024-08-06T22:21:39.6074341Z deleted: sha256:d1d89c6e648d792c08fdaae4fdad1273f93571a1b8d03c76f38f9e2be7cfe7f1 2024-08-06T22:21:39.6074921Z deleted: sha256:0f1adf9d1a1d4eeb62caa063c1090ea2f50246ed031d8d6627df2f5fe5963067 2024-08-06T22:21:39.6075496Z deleted: sha256:771e69c2306df070c5a944ea3043f81ce5890ce90de66fb6f16edfaf09912b35 2024-08-06T22:21:39.6076061Z deleted: sha256:288aa01dcd110ad9e2e3f48756d987529733a41225b6e8d4898f7386666beeae 2024-08-06T22:21:39.6077145Z deleted: sha256:d9d7c6c9bef79d8c5ce30d89b81820f3664c04fe17ae396b2a121e96fcfdeccf 2024-08-06T22:21:39.6077727Z deleted: sha256:1efa95271d86fd98beb626e63f4c67732daca36bf6e25d864735e43ef4c708f4 2024-08-06T22:21:39.6078290Z deleted: sha256:b1699f7cf1967593f94664569bb49ba431deaac0d1ceaf0d0584046a78ca3be0 2024-08-06T22:21:39.6078858Z deleted: sha256:b70eb1cb717d1c2ce90d5fa7f48e1af32d45f3735b33e1f944762686d8459aa6 2024-08-06T22:21:39.6079421Z deleted: sha256:6460a71f50d3c4bf7773fa072665e881e2a7e0e4e0bd45302e6a73c35ec03898 2024-08-06T22:21:39.6079971Z deleted: sha256:1c1787b4980844f6bfea1fe2386125ac286e9dfd80e253164a7f82b90f9c37bc 2024-08-06T22:21:39.6080537Z deleted: sha256:758fb95d0e1c4b4c78ee59fc4d66c06acfbc3a3ffa870e94200529367b8998e8 2024-08-06T22:21:39.6081220Z deleted: sha256:99beb19b8911fc7c49d248960a97d8ba4bbcbe8afc9fe3142e5af37f08c1c821 2024-08-06T22:21:39.6081787Z deleted: sha256:55596f5748c7f4d8e4d9fea0d1a6cda4627ccc442e652c8578169efe5992a382 2024-08-06T22:21:39.6082368Z deleted: sha256:c279b10b40b22a2fba8aeaad85e7705ffb1afa28e000323b35cc947649b7637d 2024-08-06T22:21:39.6082946Z deleted: sha256:8c882a49501c1d973eacd2ed8da81bce818403153d3fb4cdc84baf14307f9517 2024-08-06T22:21:39.6083506Z deleted: sha256:9fdbb157fcea7486d127e106ff114401c3102b9e9ed879a59247a296b3a908af 2024-08-06T22:21:39.6084094Z deleted: sha256:25b821c2e5d4106b0214b3aa4c88265bda5f90f045f012af912f0a3d2979f919 2024-08-06T22:21:39.6084809Z deleted: sha256:fada342ffd1729d0c38b7afe6767d6840580548d8dac7c62ddf61e24803ad66e 2024-08-06T22:21:39.6085371Z deleted: sha256:14d55c394ec592bb1417a21f696089e0549403bff381cda63d3e92ec80bca298 2024-08-06T22:21:39.6085935Z deleted: sha256:1032dedb70a12fc3c1cae4b10cdd7a07a9d20a64736533d640dd133d46f1ebfe 2024-08-06T22:21:39.6086501Z deleted: sha256:c0f5baad08f078d4210c254240524d9063c5e797564ada8a4bf39270e0fc1300 2024-08-06T22:21:39.6087065Z deleted: sha256:1a8bc02cd2c8f897e1dfd14920703ea75d172660c85026068b39136a1a25db51 2024-08-06T22:21:39.6087615Z deleted: sha256:b2221232f25b8d3a916bf8f74248542af7830e6198814ffbdbab43484bcb700b 2024-08-06T22:21:39.6088184Z deleted: sha256:14dc7064caf62d3d51b65e5175f9ae0e31af4fbed83413736f80d881f0fd742e 2024-08-06T22:21:39.6088750Z deleted: sha256:162ef080ee355a240e7e8fd6761113ad61586e1c883207ba4290c90abf208b54 2024-08-06T22:21:39.6089304Z deleted: sha256:56e3e897838040eb3fb86255ef1be2e3be6705d2e4a1ead67bbd400350ff6d13 2024-08-06T22:21:39.6089865Z deleted: sha256:a4f1392c3713fac843566ae891b9e41c28824e377af84190d33a4132cd4b268d 2024-08-06T22:21:39.6090424Z deleted: sha256:d1a8ab1bb8f3238d77db318f5161058057f84f7357afd2932e1d7edded9a2efb 2024-08-06T22:21:39.6091072Z deleted: sha256:7cab46d23f9fd91e62280b25d274d6c8a03be6d66030adf598944913d280c54f 2024-08-06T22:21:39.6091636Z deleted: sha256:764da7f97ae4318f5157e363c03c86c016b8d1b6d2c4758d203fa8f194753f57 2024-08-06T22:21:39.6092532Z deleted: sha256:373ab2f02049fb6ccab3f886c53934457751a14e5cb68b5e144b04bb7afeed87 2024-08-06T22:21:39.6093102Z deleted: sha256:e46d90cf4cc84661ac93ae9bcfbc675c0fab0582328e4595d075e4f259c4389a 2024-08-06T22:21:39.6093671Z deleted: sha256:1579b987297253e32b13fe0d160a95c567910203783be2efcc3a76650255e658 2024-08-06T22:21:39.6094462Z deleted: sha256:f8542342382d7443bc8be95e90ac743f70abdafcd9987e3785f58ad50509a145 2024-08-06T22:21:39.6095020Z deleted: sha256:57432242bf8090a0c1238a5f004be4da246f4488a2193be182e5cd82237509e4 2024-08-06T22:21:39.6095575Z deleted: sha256:e3c369d15f59fb5f34178d8dcb6bb4a6ff1d527e9baf3182444c24c55600bdf4 2024-08-06T22:21:39.6096185Z deleted: sha256:4215189c6756d59c8a217cc09447c312318d07c0b0d5c9bfb7e4bfb942f05cae 2024-08-06T22:21:39.6096729Z deleted: sha256:b787e6329788262404e4b8293a110727833b63e2e287ccd569a21cb3fb450388 2024-08-06T22:21:39.6097285Z deleted: sha256:69bb5dca4b9fded8d0c731b73bcbed0c3e9ce170c89e79796956a444b0d58c4c 2024-08-06T22:21:39.6097865Z deleted: sha256:8c945f56e57fb4fbfa3f2c74d6109a47db8df7644c4988d0c5791e4214bf30c9 2024-08-06T22:21:39.6098431Z deleted: sha256:fdd079e0e07e11d86ffc744f1036793f3db2aea378660f7489fbf50002439620 2024-08-06T22:21:39.6099241Z deleted: sha256:6a743a461982e1a1b73f29cd187e354f2119f3bd985e1da6ca0b802134cb91ba 2024-08-06T22:21:39.6099802Z deleted: sha256:7b3cbce1a91c69d6a499ef524311c14537930fc04f0ef4b1d73030216ac4a568 2024-08-06T22:21:39.6100353Z deleted: sha256:d7120733e426141c8a9e8f2c3596b12055cd5b1956d141ad640365cb11628a00 2024-08-06T22:21:39.6100899Z deleted: sha256:841ae666d028560ea37e6314ae5f80d72fb071fe7592b577dbbd8156c08cdda2 2024-08-06T22:21:39.6101468Z deleted: sha256:9d85a8c4ea0c0bc631a4d3e5472e8ed8e2312a4dc5a38f6e9890edeb37183186 2024-08-06T22:21:39.6102033Z deleted: sha256:606f6418680b5c53158271dfaca16c668f3b9dd25bb2b738f5621b6a56d08cdb 2024-08-06T22:21:39.6102603Z deleted: sha256:5ca4df537c3efc44dfde3bebb377ed490bc4d1cd5d6e5fa2fd9549dcb456c471 2024-08-06T22:21:39.6103181Z deleted: sha256:c6f3f6b7969e62b95fe926a1963e6628cb2f1b5388f0b12115564855e59423f1 2024-08-06T22:21:39.6103810Z deleted: sha256:ce178391f5f25e4ebe6ba8e84c84831fb31b8a9d4e82dbc270a73ddeccbf1c2c 2024-08-06T22:21:39.6104396Z deleted: sha256:d154e0a00edb13898d564f38c090096dfeb7a90beb0bd8addd57eca33f52151e 2024-08-06T22:21:39.6104958Z deleted: sha256:7226a8914ccef500b1d0327370cf9e3a66fbcb0b59a5f3b018981372152fc645 2024-08-06T22:21:39.6105521Z deleted: sha256:ce5fab2081603cd22d9f253d98828ddd60c0f8c44c5413d0615f4264f1cd6a7b 2024-08-06T22:21:39.6106100Z deleted: sha256:b3bc21b2cb6a9ca2b1dbbc412812581089c8ace8cc6b8d2b2767f0b3cbe8b99c 2024-08-06T22:21:39.6106660Z deleted: sha256:53e8f21eb1afc2c8550bef59d67810a5f79671d1d3d3924577f810321e66886b 2024-08-06T22:21:39.6107228Z deleted: sha256:f9770918a5cc46e8f6aa13da6903754fa89a6bcc9f6e11df1d61f9340e452cc8 2024-08-06T22:21:39.6107801Z deleted: sha256:bce2de65cb97e6a64f59fe7fc78644a2f3d62cf7769073cb59dcbb52009fc5d0 2024-08-06T22:21:39.6108366Z deleted: sha256:568d161c6888a8b0b09d7feb561f790a31d47c7ff7ca1c89c0ff4025f5b02e3a 2024-08-06T22:21:39.6108936Z deleted: sha256:2da22d4dfb814e7a3f4444f60464109149d35bc3e078e197f600bd9f6cbb9f6b 2024-08-06T22:21:39.6109499Z deleted: sha256:4af2daf4a1277af00005823106d88d704d5d0499eef553f01875c6f94380b2e3 2024-08-06T22:21:39.6110065Z deleted: sha256:e52d382a6cbfc1bb5122186d4b1401de79de002a9cc323ba5047dbf266f4d3a3 2024-08-06T22:21:39.6110628Z deleted: sha256:d10fe93dc5314d752e8882c5877457a6ae733a93a1e246cd6f79301de9325e9c 2024-08-06T22:21:39.6111192Z deleted: sha256:58d5ecb60ded3f99102b25d909a729386d92a44bf68b77ed3b49bf27d978e26a 2024-08-06T22:21:39.6111759Z deleted: sha256:dd522c9c6fdd98e9e4584578a66d9122b36ee8f856bba2140fcaaf908c7a68e8 2024-08-06T22:21:39.6112319Z deleted: sha256:3584fb50d1cbc80b57118953f1dc36ecb092b5986d56211f711b9161f403d66c 2024-08-06T22:21:39.6112881Z deleted: sha256:35a033279d5c9bd6b862bfa9331d8fcbf98bbaf778a778fabc882384db9204f8 2024-08-06T22:21:39.6113443Z deleted: sha256:ab9b00c496073d62e69e032178858d3d6d4c4bab9e87065d3785c6da57351d00 2024-08-06T22:21:39.6114012Z deleted: sha256:7ed947299e109c2459de6e240d86f049c30f93f2380659ce441d8737b8cd065f 2024-08-06T22:21:39.6114565Z deleted: sha256:410d5a7f7a9a1cb4551c106b9cc728a9dcff598f9be231a351ce6a0a33f81e64 2024-08-06T22:21:39.6115156Z deleted: sha256:5f0021bb56efa14bb93978c01513e2a1187ba30e69bdb0546d0ef39b30873f88 2024-08-06T22:21:39.6115727Z deleted: sha256:c1601aa97eb84151c14da3aeea351201bd99144d36d66b397f6555d80245d86d 2024-08-06T22:21:39.6116294Z deleted: sha256:72af05a89be22accdc1ca5d66dcbbb33993a9ef5997f849df1d4ba4c48049f25 2024-08-06T22:21:39.6116871Z deleted: sha256:38b90c9663dcaa2bc57a4dd3008298e7ea93e9535fa42312b4dc4246e7491af9 2024-08-06T22:21:39.6117448Z deleted: sha256:bda61b6cefb3ec8eeb74fef1ca1c7f9a5845fe5e8f07b8123323e897425d5c29 2024-08-06T22:21:39.6118021Z deleted: sha256:5a18e1aa877074529a84cbddf19f8d5403787823378ceae6b72fb62f78d43037 2024-08-06T22:21:39.6118579Z deleted: sha256:6c3e7df31590f02f10cb71fc4eb27653e9b428df2e6e5421a455b062bd2e39f9 2024-08-06T22:21:39.6118914Z 2024-08-06T22:21:39.6119061Z Total reclaimed space: 25.27GB 2024-08-06T22:21:39.6188750Z Post job cleanup. 2024-08-06T22:21:39.6237295Z Post job cleanup. 2024-08-06T22:21:39.7115452Z [command]/usr/bin/git version 2024-08-06T22:21:39.7168245Z git version 2.40.1 2024-08-06T22:21:39.7205374Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/73e7aaef-78cf-4349-af19-63cf760410f0' before making global git config changes 2024-08-06T22:21:39.7206225Z Adding repository directory to the temporary git global config as a safe directory 2024-08-06T22:21:39.7210445Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-06T22:21:39.7259289Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2024-08-06T22:21:39.7303706Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2024-08-06T22:21:39.7679121Z Entering 'android/libs/fbjni' 2024-08-06T22:21:39.7749221Z Entering 'third_party/FP16' 2024-08-06T22:21:39.7820979Z Entering 'third_party/FXdiv' 2024-08-06T22:21:39.7890797Z Entering 'third_party/NNPACK' 2024-08-06T22:21:39.7960157Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T22:21:39.8029545Z Entering 'third_party/XNNPACK' 2024-08-06T22:21:39.8113955Z Entering 'third_party/benchmark' 2024-08-06T22:21:39.8182877Z Entering 'third_party/cpp-httplib' 2024-08-06T22:21:39.8252055Z Entering 'third_party/cpuinfo' 2024-08-06T22:21:39.8321593Z Entering 'third_party/cudnn_frontend' 2024-08-06T22:21:39.8389404Z Entering 'third_party/cutlass' 2024-08-06T22:21:39.8469117Z Entering 'third_party/eigen' 2024-08-06T22:21:39.8542736Z Entering 'third_party/fbgemm' 2024-08-06T22:21:39.8612124Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T22:21:39.8677835Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T22:21:39.8744489Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T22:21:39.8818990Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T22:21:39.8884810Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T22:21:39.8955134Z Entering 'third_party/flatbuffers' 2024-08-06T22:21:39.9028301Z Entering 'third_party/fmt' 2024-08-06T22:21:39.9096645Z Entering 'third_party/foxi' 2024-08-06T22:21:39.9169375Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T22:21:39.9238595Z Entering 'third_party/gloo' 2024-08-06T22:21:39.9307717Z Entering 'third_party/googletest' 2024-08-06T22:21:39.9376521Z Entering 'third_party/ideep' 2024-08-06T22:21:39.9445714Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T22:21:39.9523006Z Entering 'third_party/ittapi' 2024-08-06T22:21:39.9593019Z Entering 'third_party/kineto' 2024-08-06T22:21:39.9660748Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T22:21:39.9725433Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T22:21:39.9794061Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T22:21:39.9862377Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T22:21:39.9930969Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T22:21:39.9997495Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T22:21:40.0068318Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T22:21:40.0136459Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T22:21:40.0203943Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T22:21:40.0272476Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T22:21:40.0343150Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T22:21:40.0412428Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T22:21:40.0484869Z Entering 'third_party/mimalloc' 2024-08-06T22:21:40.0555745Z Entering 'third_party/nccl/nccl' 2024-08-06T22:21:40.0624518Z Entering 'third_party/nlohmann' 2024-08-06T22:21:40.0694160Z Entering 'third_party/onnx' 2024-08-06T22:21:40.0776586Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T22:21:40.0844010Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T22:21:40.0919773Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T22:21:40.0988213Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T22:21:40.1054740Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T22:21:40.1121530Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T22:21:40.1187707Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T22:21:40.1256223Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T22:21:40.1327678Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T22:21:40.1398810Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T22:21:40.1462114Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T22:21:40.1531360Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T22:21:40.1605395Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T22:21:40.1702918Z Entering 'third_party/pocketfft' 2024-08-06T22:21:40.1771914Z Entering 'third_party/protobuf' 2024-08-06T22:21:40.1842982Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T22:21:40.1914691Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T22:21:40.1984283Z Entering 'third_party/psimd' 2024-08-06T22:21:40.2055110Z Entering 'third_party/pthreadpool' 2024-08-06T22:21:40.2126515Z Entering 'third_party/pybind11' 2024-08-06T22:21:40.2286104Z Entering 'third_party/python-peachpy' 2024-08-06T22:21:40.2356938Z Entering 'third_party/sleef' 2024-08-06T22:21:40.2425246Z Entering 'third_party/tensorpipe' 2024-08-06T22:21:40.2489093Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T22:21:40.2559213Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T22:21:40.2624908Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T22:21:40.2690645Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T22:21:40.2755104Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T22:21:40.2848142Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2024-08-06T22:21:40.2885768Z http.https://github.com/.extraheader 2024-08-06T22:21:40.2896120Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2024-08-06T22:21:40.2944716Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2024-08-06T22:21:40.3343009Z Entering 'android/libs/fbjni' 2024-08-06T22:21:40.3388201Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3432196Z Entering 'third_party/FP16' 2024-08-06T22:21:40.3477873Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3520207Z Entering 'third_party/FXdiv' 2024-08-06T22:21:40.3563342Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3606082Z Entering 'third_party/NNPACK' 2024-08-06T22:21:40.3648472Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3691896Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-06T22:21:40.3737735Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3781900Z Entering 'third_party/XNNPACK' 2024-08-06T22:21:40.3825728Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3883473Z Entering 'third_party/benchmark' 2024-08-06T22:21:40.3926838Z http.https://github.com/.extraheader 2024-08-06T22:21:40.3969310Z Entering 'third_party/cpp-httplib' 2024-08-06T22:21:40.4013335Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4057375Z Entering 'third_party/cpuinfo' 2024-08-06T22:21:40.4102107Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4144932Z Entering 'third_party/cudnn_frontend' 2024-08-06T22:21:40.4193257Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4237746Z Entering 'third_party/cutlass' 2024-08-06T22:21:40.4281707Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4332197Z Entering 'third_party/eigen' 2024-08-06T22:21:40.4377376Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4423880Z Entering 'third_party/fbgemm' 2024-08-06T22:21:40.4467335Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4509471Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-06T22:21:40.4553097Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4597220Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-06T22:21:40.4639752Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4683753Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-06T22:21:40.4726968Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4776804Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-06T22:21:40.4820144Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4865679Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-06T22:21:40.4910017Z http.https://github.com/.extraheader 2024-08-06T22:21:40.4955595Z Entering 'third_party/flatbuffers' 2024-08-06T22:21:40.4999455Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5044289Z Entering 'third_party/fmt' 2024-08-06T22:21:40.5087693Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5133150Z Entering 'third_party/foxi' 2024-08-06T22:21:40.5178318Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5222078Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-06T22:21:40.5265320Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5308251Z Entering 'third_party/gloo' 2024-08-06T22:21:40.5351480Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5395702Z Entering 'third_party/googletest' 2024-08-06T22:21:40.5441119Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5485406Z Entering 'third_party/ideep' 2024-08-06T22:21:40.5533978Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5574493Z Entering 'third_party/ideep/mkl-dnn' 2024-08-06T22:21:40.5618148Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5670389Z Entering 'third_party/ittapi' 2024-08-06T22:21:40.5713910Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5756598Z Entering 'third_party/kineto' 2024-08-06T22:21:40.5806090Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5848154Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-06T22:21:40.5893774Z http.https://github.com/.extraheader 2024-08-06T22:21:40.5936765Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-06T22:21:40.5980354Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6026527Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-06T22:21:40.6068930Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6117110Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-06T22:21:40.6159541Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6204117Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-06T22:21:40.6246585Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6292843Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-06T22:21:40.6336588Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6387277Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-06T22:21:40.6435182Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6479233Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-06T22:21:40.6521939Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6567097Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-06T22:21:40.6611013Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6657339Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-06T22:21:40.6701913Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6750009Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-06T22:21:40.6793051Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6836544Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-06T22:21:40.6879075Z http.https://github.com/.extraheader 2024-08-06T22:21:40.6925579Z Entering 'third_party/mimalloc' 2024-08-06T22:21:40.6969082Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7012695Z Entering 'third_party/nccl/nccl' 2024-08-06T22:21:40.7056493Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7100771Z Entering 'third_party/nlohmann' 2024-08-06T22:21:40.7145725Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7189187Z Entering 'third_party/onnx' 2024-08-06T22:21:40.7233207Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7292936Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-06T22:21:40.7335793Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7379867Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-06T22:21:40.7423971Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7472824Z Entering 'third_party/opentelemetry-cpp' 2024-08-06T22:21:40.7521498Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7565664Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-06T22:21:40.7609092Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7651777Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-06T22:21:40.7693363Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7737776Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-06T22:21:40.7778996Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7822531Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-06T22:21:40.7864445Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7909361Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-06T22:21:40.7951367Z http.https://github.com/.extraheader 2024-08-06T22:21:40.7994362Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-06T22:21:40.8037325Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8080640Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-06T22:21:40.8123814Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8166062Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-06T22:21:40.8213997Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8259111Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-06T22:21:40.8305265Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8353125Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-06T22:21:40.8396798Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8462321Z Entering 'third_party/pocketfft' 2024-08-06T22:21:40.8507312Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8550551Z Entering 'third_party/protobuf' 2024-08-06T22:21:40.8594987Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8638993Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-06T22:21:40.8683265Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8726646Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-06T22:21:40.8770076Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8817203Z Entering 'third_party/psimd' 2024-08-06T22:21:40.8865350Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8907617Z Entering 'third_party/pthreadpool' 2024-08-06T22:21:40.8955736Z http.https://github.com/.extraheader 2024-08-06T22:21:40.8998558Z Entering 'third_party/pybind11' 2024-08-06T22:21:40.9041795Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9084334Z Entering 'third_party/python-peachpy' 2024-08-06T22:21:40.9128171Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9169930Z Entering 'third_party/sleef' 2024-08-06T22:21:40.9213614Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9256161Z Entering 'third_party/tensorpipe' 2024-08-06T22:21:40.9300013Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9341288Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-06T22:21:40.9383677Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9427778Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-06T22:21:40.9471540Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9513977Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-06T22:21:40.9555348Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9598178Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-06T22:21:40.9639656Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9680534Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-06T22:21:40.9724589Z http.https://github.com/.extraheader 2024-08-06T22:21:40.9881795Z A job completed hook has been configured by the self-hosted runner administrator 2024-08-06T22:21:40.9910298Z ##[group]Run '/home/ec2-user/runner-scripts/after_job.sh' 2024-08-06T22:21:40.9917957Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-06T22:21:40.9918297Z ##[endgroup] 2024-08-06T22:21:47.8426940Z Cleaning up orphan processes