2024-08-08T20:52:59.9463541Z Current runner version: '2.319.0' 2024-08-08T20:52:59.9470041Z Runner name: 'i-09ba204a38896644d' 2024-08-08T20:52:59.9470987Z Runner group name: 'Default' 2024-08-08T20:52:59.9471936Z Machine name: 'ip-10-0-41-199' 2024-08-08T20:52:59.9490312Z ##[group]GITHUB_TOKEN Permissions 2024-08-08T20:52:59.9492282Z Actions: read 2024-08-08T20:52:59.9492816Z Attestations: read 2024-08-08T20:52:59.9493351Z Checks: read 2024-08-08T20:52:59.9493904Z Contents: read 2024-08-08T20:52:59.9494402Z Deployments: read 2024-08-08T20:52:59.9494934Z Discussions: read 2024-08-08T20:52:59.9495587Z Issues: read 2024-08-08T20:52:59.9496061Z Metadata: read 2024-08-08T20:52:59.9496572Z Packages: read 2024-08-08T20:52:59.9497132Z Pages: read 2024-08-08T20:52:59.9497617Z PullRequests: read 2024-08-08T20:52:59.9498178Z RepositoryProjects: read 2024-08-08T20:52:59.9498871Z SecurityEvents: read 2024-08-08T20:52:59.9499386Z Statuses: read 2024-08-08T20:52:59.9499895Z ##[endgroup] 2024-08-08T20:52:59.9503203Z Secret source: Actions 2024-08-08T20:52:59.9504075Z Prepare workflow directory 2024-08-08T20:53:00.4629400Z Prepare all required actions 2024-08-08T20:53:00.4802924Z Getting action download info 2024-08-08T20:53:00.6684127Z Download action repository 'pytorch/test-infra@main' (SHA:0b5c20199b94c3b12545de0d354b54a297b0e061) 2024-08-08T20:53:01.1032526Z Download action repository 'pytorch/pytorch@main' (SHA:7bd0732cbdca4ae572a8b4de2017cdfdff158d55) 2024-08-08T20:53:05.1318111Z Download action repository 'aws-actions/configure-aws-credentials@v3' (SHA:50ac8dd1e1b10d09dac7b8727528b91bed831ac0) 2024-08-08T20:53:05.3246037Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2024-08-08T20:53:05.7092286Z Getting action download info 2024-08-08T20:53:05.7996141Z Download action repository 'malfet/checkout@silent-checkout' (SHA:e07af140b3ccefc05679e3755b9db68f4ee4589c) 2024-08-08T20:53:06.0027915Z Getting action download info 2024-08-08T20:53:06.1186023Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2024-08-08T20:53:06.2498254Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/tags/ciflow/trunk/132710 (b9d86fa89636e301796d4201f36d86c73f6e49bc) 2024-08-08T20:53:06.2500412Z ##[group] Inputs 2024-08-08T20:53:06.2500887Z build-environment: linux-focal-cuda12.4-py3.10-gcc9-sm86 2024-08-08T20:53:06.2503350Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]} 2024-08-08T20:53:06.2506296Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:53:06.2507323Z sync-tag: 2024-08-08T20:53:06.2508095Z timeout-minutes: 240 2024-08-08T20:53:06.2508426Z use-gha: 2024-08-08T20:53:06.2508705Z dashboard-tag: 2024-08-08T20:53:06.2509017Z s3-bucket: gha-artifacts 2024-08-08T20:53:06.2509371Z aws-role-to-assume: 2024-08-08T20:53:06.2509691Z ##[endgroup] 2024-08-08T20:53:06.2510607Z Complete job name: linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T20:53:06.3185687Z A job started hook has been configured by the self-hosted runner administrator 2024-08-08T20:53:06.3323847Z ##[group]Run '/home/ec2-user/runner-scripts/before_job.sh' 2024-08-08T20:53:06.3335737Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:53:06.3336257Z ##[endgroup] 2024-08-08T20:53:08.1348050Z Runner Type: amz2023.linux.g5.4xlarge.nvidia.gpu 2024-08-08T20:53:08.1348586Z Instance Type: g5.4xlarge 2024-08-08T20:53:08.1349780Z AMI Name: al2023-ami-2023.5.20240701.0-kernel-6.1-x86_64 2024-08-08T20:53:08.1350335Z AMI ID: ami-06c68f701d8090592 2024-08-08T20:53:13.9387507Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2024-08-08T20:53:13.9388080Z with: 2024-08-08T20:53:13.9388742Z github-secret: *** 2024-08-08T20:53:13.9389661Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash 2024-08-08T20:53:13.9390645Z activate-with-label: false 2024-08-08T20:53:13.9391004Z label: with-ssh 2024-08-08T20:53:13.9391319Z remove-existing-keys: true 2024-08-08T20:53:13.9391679Z fail-silently: true 2024-08-08T20:53:13.9392084Z env: 2024-08-08T20:53:13.9392349Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:53:13.9392688Z ##[endgroup] 2024-08-08T20:53:14.0290508Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info. 2024-08-08T20:53:14.0292970Z ciflow reference detected, attempting to extract PR number 2024-08-08T20:53:14.4974997Z Grabbing public ssh keys from https://github.com/pytorch-bot[bot].keys 2024-08-08T20:53:14.5875367Z No SSH keys found for user pytorch-bot[bot] 2024-08-08T20:53:14.5876006Z Grabbing public ssh keys from https://github.com/drisspg.keys 2024-08-08T20:53:14.6525410Z ~/.ssh/authorized_keys file found on node, removing ~/.ssh and starting fresh 2024-08-08T20:53:14.6539539Z Public keys pulled and installed to /home/ec2-user/.ssh/authorized_keys 2024-08-08T20:53:14.6564670Z Login using: ssh ec2-user@ec2-44-203-92-203.compute-1.amazonaws.com 2024-08-08T20:53:14.6565491Z All testing is done inside the container, to start an interactive session run: 2024-08-08T20:53:14.6566304Z docker exec -it $(docker container ps --format '{{.ID}}') bash 2024-08-08T20:53:14.6698081Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2024-08-08T20:53:14.6698612Z with: 2024-08-08T20:53:14.6698881Z submodules: recursive 2024-08-08T20:53:14.6699224Z fetch-depth: 0 2024-08-08T20:53:14.6699496Z env: 2024-08-08T20:53:14.6699758Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:53:14.6700086Z ##[endgroup] 2024-08-08T20:53:14.6900296Z ##[group]Run retry () { 2024-08-08T20:53:14.6900676Z retry () { 2024-08-08T20:53:14.6901179Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*) 2024-08-08T20:53:14.6901722Z } 2024-08-08T20:53:14.6902021Z echo "${GITHUB_WORKSPACE}" 2024-08-08T20:53:14.6902434Z if [ -z "${NO_SUDO}" ]; then 2024-08-08T20:53:14.6902903Z  retry sudo rm -rf "${GITHUB_WORKSPACE}" 2024-08-08T20:53:14.6903344Z else 2024-08-08T20:53:14.6903682Z  retry rm -rf "${GITHUB_WORKSPACE}" 2024-08-08T20:53:14.6904102Z fi 2024-08-08T20:53:14.6904433Z mkdir "${GITHUB_WORKSPACE}" 2024-08-08T20:53:14.6914794Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:53:14.6915601Z env: 2024-08-08T20:53:14.6915899Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:53:14.6916235Z NO_SUDO: 2024-08-08T20:53:14.6916511Z ##[endgroup] 2024-08-08T20:53:14.6948103Z /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-08T20:53:19.3420165Z ##[group]Run malfet/checkout@silent-checkout 2024-08-08T20:53:19.3420595Z with: 2024-08-08T20:53:19.3420919Z ref: b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T20:53:19.3421355Z fetch-depth: 0 2024-08-08T20:53:19.3421660Z submodules: recursive 2024-08-08T20:53:19.3422002Z quiet-checkout: true 2024-08-08T20:53:19.3422352Z repository: pytorch/pytorch 2024-08-08T20:53:19.3422824Z token: *** 2024-08-08T20:53:19.3423126Z ssh-strict: true 2024-08-08T20:53:19.3423457Z persist-credentials: true 2024-08-08T20:53:19.3423808Z clean: true 2024-08-08T20:53:19.3424146Z sparse-checkout-cone-mode: true 2024-08-08T20:53:19.3424533Z lfs: false 2024-08-08T20:53:19.3424831Z set-safe-directory: true 2024-08-08T20:53:19.3425173Z env: 2024-08-08T20:53:19.3425661Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:53:19.3426000Z ##[endgroup] 2024-08-08T20:53:19.4404961Z Syncing repository: pytorch/pytorch 2024-08-08T20:53:19.4406488Z ##[group]Getting Git version info 2024-08-08T20:53:19.4407217Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2024-08-08T20:53:19.4408059Z [command]/usr/bin/git version 2024-08-08T20:53:19.4408420Z git version 2.40.1 2024-08-08T20:53:19.4412290Z ##[endgroup] 2024-08-08T20:53:19.4429690Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/22bbfd1c-3293-48ca-9a0f-6ed81db68907' before making global git config changes 2024-08-08T20:53:19.4431111Z Adding repository directory to the temporary git global config as a safe directory 2024-08-08T20:53:19.4432397Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-08T20:53:19.4483965Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2024-08-08T20:53:19.4487649Z ##[group]Initializing the repository 2024-08-08T20:53:19.4490318Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-08T20:53:19.4535936Z hint: Using 'master' as the name for the initial branch. This default branch name 2024-08-08T20:53:19.4536871Z hint: is subject to change. To configure the initial branch name to use in all 2024-08-08T20:53:19.4537655Z hint: of your new repositories, which will suppress this warning, call: 2024-08-08T20:53:19.4538203Z hint: 2024-08-08T20:53:19.4538675Z hint: git config --global init.defaultBranch 2024-08-08T20:53:19.4539143Z hint: 2024-08-08T20:53:19.4539654Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2024-08-08T20:53:19.4540490Z hint: 'development'. The just-created branch can be renamed via this command: 2024-08-08T20:53:19.4541514Z hint: 2024-08-08T20:53:19.4541867Z hint: git branch -m 2024-08-08T20:53:19.4542601Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2024-08-08T20:53:19.4549205Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2024-08-08T20:53:19.4593071Z ##[endgroup] 2024-08-08T20:53:19.4593643Z ##[group]Disabling automatic garbage collection 2024-08-08T20:53:19.4595932Z [command]/usr/bin/git config --local gc.auto 0 2024-08-08T20:53:19.4639248Z ##[endgroup] 2024-08-08T20:53:19.4639760Z ##[group]Setting up auth 2024-08-08T20:53:19.4645144Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2024-08-08T20:53:19.4688737Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2024-08-08T20:53:19.5039920Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2024-08-08T20:53:19.5081893Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2024-08-08T20:53:19.5438855Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2024-08-08T20:53:19.5497844Z ##[endgroup] 2024-08-08T20:53:19.5498403Z ##[group]Fetching the repository 2024-08-08T20:53:19.5503947Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --progress --no-recurse-submodules --quiet origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2024-08-08T20:53:21.6645702Z remote: Enumerating objects: 1008986 2024-08-08T20:53:21.6646333Z remote: Enumerating objects: 1011899, done. 2024-08-08T20:53:21.6647606Z remote: Counting objects: 0% (1/2913) 2024-08-08T20:53:21.6648370Z remote: Counting objects: 1% (30/2913) 2024-08-08T20:53:21.6649545Z remote: Counting objects: 2% (59/2913) 2024-08-08T20:53:21.6650459Z remote: Counting objects: 3% (88/2913) 2024-08-08T20:53:21.6651428Z remote: Counting objects: 4% (117/2913) 2024-08-08T20:53:21.6652129Z remote: Counting objects: 5% (146/2913) 2024-08-08T20:53:21.6652967Z remote: Counting objects: 6% (175/2913) 2024-08-08T20:53:21.6653454Z remote: Counting objects: 7% (204/2913) 2024-08-08T20:53:21.6654064Z remote: Counting objects: 8% (234/2913) 2024-08-08T20:53:21.6654579Z remote: Counting objects: 9% (263/2913) 2024-08-08T20:53:21.6655310Z remote: Counting objects: 10% (292/2913) 2024-08-08T20:53:21.6655810Z remote: Counting objects: 11% (321/2913) 2024-08-08T20:53:21.6656405Z remote: Counting objects: 12% (350/2913) 2024-08-08T20:53:21.6657055Z remote: Counting objects: 13% (379/2913) 2024-08-08T20:53:21.6657823Z remote: Counting objects: 14% (408/2913) 2024-08-08T20:53:21.6658338Z remote: Counting objects: 15% (437/2913) 2024-08-08T20:53:21.6659370Z remote: Counting objects: 16% (467/2913) 2024-08-08T20:53:21.6659931Z remote: Counting objects: 17% (496/2913) 2024-08-08T20:53:21.6660706Z remote: Counting objects: 18% (525/2913) 2024-08-08T20:53:21.6661994Z remote: Counting objects: 19% (554/2913) 2024-08-08T20:53:21.6662533Z remote: Counting objects: 20% (583/2913) 2024-08-08T20:53:21.6663240Z remote: Counting objects: 21% (612/2913) 2024-08-08T20:53:21.6664555Z remote: Counting objects: 22% (641/2913) 2024-08-08T20:53:21.6667300Z remote: Counting objects: 23% (670/2913) 2024-08-08T20:53:21.6668130Z remote: Counting objects: 24% (700/2913) 2024-08-08T20:53:21.6670668Z remote: Counting objects: 25% (729/2913) 2024-08-08T20:53:21.6671213Z remote: Counting objects: 26% (758/2913) 2024-08-08T20:53:21.6671779Z remote: Counting objects: 27% (787/2913) 2024-08-08T20:53:21.6672841Z remote: Counting objects: 28% (816/2913) 2024-08-08T20:53:21.6673690Z remote: Counting objects: 29% (845/2913) 2024-08-08T20:53:21.6674189Z remote: Counting objects: 30% (874/2913) 2024-08-08T20:53:21.6674990Z remote: Counting objects: 31% (904/2913) 2024-08-08T20:53:21.6675803Z remote: Counting objects: 32% (933/2913) 2024-08-08T20:53:21.6676304Z remote: Counting objects: 33% (962/2913) 2024-08-08T20:53:21.6676881Z remote: Counting objects: 34% (991/2913) 2024-08-08T20:53:21.6677547Z remote: Counting objects: 35% (1020/2913) 2024-08-08T20:53:21.6678454Z remote: Counting objects: 36% (1049/2913) 2024-08-08T20:53:21.6678976Z remote: Counting objects: 37% (1078/2913) 2024-08-08T20:53:21.6679561Z remote: Counting objects: 38% (1107/2913) 2024-08-08T20:53:21.6680255Z remote: Counting objects: 39% (1137/2913) 2024-08-08T20:53:21.6681084Z remote: Counting objects: 40% (1166/2913) 2024-08-08T20:53:21.6681610Z remote: Counting objects: 41% (1195/2913) 2024-08-08T20:53:21.6682226Z remote: Counting objects: 42% (1224/2913) 2024-08-08T20:53:21.6682928Z remote: Counting objects: 43% (1253/2913) 2024-08-08T20:53:21.6683728Z remote: Counting objects: 44% (1282/2913) 2024-08-08T20:53:21.6684246Z remote: Counting objects: 45% (1311/2913) 2024-08-08T20:53:21.6684909Z remote: Counting objects: 46% (1340/2913) 2024-08-08T20:53:21.6685493Z remote: Counting objects: 47% (1370/2913) 2024-08-08T20:53:21.6686317Z remote: Counting objects: 48% (1399/2913) 2024-08-08T20:53:21.6686824Z remote: Counting objects: 49% (1428/2913) 2024-08-08T20:53:21.6687469Z remote: Counting objects: 50% (1457/2913) 2024-08-08T20:53:21.6688131Z remote: Counting objects: 51% (1486/2913) 2024-08-08T20:53:21.6688938Z remote: Counting objects: 52% (1515/2913) 2024-08-08T20:53:21.6689458Z remote: Counting objects: 53% (1544/2913) 2024-08-08T20:53:21.6690124Z remote: Counting objects: 54% (1574/2913) 2024-08-08T20:53:21.6690812Z remote: Counting objects: 55% (1603/2913) 2024-08-08T20:53:21.6691754Z remote: Counting objects: 56% (1632/2913) 2024-08-08T20:53:21.6692343Z remote: Counting objects: 57% (1661/2913) 2024-08-08T20:53:21.6693004Z remote: Counting objects: 58% (1690/2913) 2024-08-08T20:53:21.6693863Z remote: Counting objects: 59% (1719/2913) 2024-08-08T20:53:21.6694402Z remote: Counting objects: 60% (1748/2913) 2024-08-08T20:53:21.6695011Z remote: Counting objects: 61% (1777/2913) 2024-08-08T20:53:21.6695640Z remote: Counting objects: 62% (1807/2913) 2024-08-08T20:53:21.6696491Z remote: Counting objects: 63% (1836/2913) 2024-08-08T20:53:21.6697003Z remote: Counting objects: 64% (1865/2913) 2024-08-08T20:53:21.6697581Z remote: Counting objects: 65% (1894/2913) 2024-08-08T20:53:21.6698293Z remote: Counting objects: 66% (1923/2913) 2024-08-08T20:53:21.6699251Z remote: Counting objects: 67% (1952/2913) 2024-08-08T20:53:21.6699755Z remote: Counting objects: 68% (1981/2913) 2024-08-08T20:53:21.6700406Z remote: Counting objects: 69% (2010/2913) 2024-08-08T20:53:21.6701000Z remote: Counting objects: 70% (2040/2913) 2024-08-08T20:53:21.6701846Z remote: Counting objects: 71% (2069/2913) 2024-08-08T20:53:21.6702397Z remote: Counting objects: 72% (2098/2913) 2024-08-08T20:53:21.6703053Z remote: Counting objects: 73% (2127/2913) 2024-08-08T20:53:21.6703861Z remote: Counting objects: 74% (2156/2913) 2024-08-08T20:53:21.6704393Z remote: Counting objects: 75% (2185/2913) 2024-08-08T20:53:21.6704964Z remote: Counting objects: 76% (2214/2913) 2024-08-08T20:53:21.6705478Z remote: Counting objects: 77% (2244/2913) 2024-08-08T20:53:21.6705989Z remote: Counting objects: 78% (2273/2913) 2024-08-08T20:53:21.6706662Z remote: Counting objects: 79% (2302/2913) 2024-08-08T20:53:21.6707200Z remote: Counting objects: 80% (2331/2913) 2024-08-08T20:53:21.6707705Z remote: Counting objects: 81% (2360/2913) 2024-08-08T20:53:21.6708216Z remote: Counting objects: 82% (2389/2913) 2024-08-08T20:53:21.6708716Z remote: Counting objects: 83% (2418/2913) 2024-08-08T20:53:21.6709214Z remote: Counting objects: 84% (2447/2913) 2024-08-08T20:53:21.6709717Z remote: Counting objects: 85% (2477/2913) 2024-08-08T20:53:21.6710219Z remote: Counting objects: 86% (2506/2913) 2024-08-08T20:53:21.6710717Z remote: Counting objects: 87% (2535/2913) 2024-08-08T20:53:21.6711220Z remote: Counting objects: 88% (2564/2913) 2024-08-08T20:53:21.6711722Z remote: Counting objects: 89% (2593/2913) 2024-08-08T20:53:21.6712375Z remote: Counting objects: 90% (2622/2913) 2024-08-08T20:53:21.6712883Z remote: Counting objects: 91% (2651/2913) 2024-08-08T20:53:21.6713398Z remote: Counting objects: 92% (2680/2913) 2024-08-08T20:53:21.6713903Z remote: Counting objects: 93% (2710/2913) 2024-08-08T20:53:21.6714417Z remote: Counting objects: 94% (2739/2913) 2024-08-08T20:53:21.6714924Z remote: Counting objects: 95% (2768/2913) 2024-08-08T20:53:21.6715741Z remote: Counting objects: 96% (2797/2913) 2024-08-08T20:53:21.6716246Z remote: Counting objects: 97% (2826/2913) 2024-08-08T20:53:21.6716748Z remote: Counting objects: 98% (2855/2913) 2024-08-08T20:53:21.6717245Z remote: Counting objects: 99% (2884/2913) 2024-08-08T20:53:21.6717750Z remote: Counting objects: 100% (2913/2913) 2024-08-08T20:53:21.6718293Z remote: Counting objects: 100% (2913/2913), done. 2024-08-08T20:53:21.7002059Z remote: Compressing objects: 0% (1/1575) 2024-08-08T20:53:21.7354278Z remote: Compressing objects: 1% (16/1575) 2024-08-08T20:53:21.7553894Z remote: Compressing objects: 2% (32/1575) 2024-08-08T20:53:21.7727230Z remote: Compressing objects: 3% (48/1575) 2024-08-08T20:53:21.9090561Z remote: Compressing objects: 4% (63/1575) 2024-08-08T20:53:22.1068177Z remote: Compressing objects: 5% (79/1575) 2024-08-08T20:53:22.2137434Z remote: Compressing objects: 6% (95/1575) 2024-08-08T20:53:22.2616041Z remote: Compressing objects: 7% (111/1575) 2024-08-08T20:53:22.2837715Z remote: Compressing objects: 8% (126/1575) 2024-08-08T20:53:22.3281508Z remote: Compressing objects: 9% (142/1575) 2024-08-08T20:53:22.3694717Z remote: Compressing objects: 10% (158/1575) 2024-08-08T20:53:22.4064817Z remote: Compressing objects: 11% (174/1575) 2024-08-08T20:53:22.4311042Z remote: Compressing objects: 12% (189/1575) 2024-08-08T20:53:22.4643491Z remote: Compressing objects: 13% (205/1575) 2024-08-08T20:53:22.4854517Z remote: Compressing objects: 14% (221/1575) 2024-08-08T20:53:22.5087227Z remote: Compressing objects: 15% (237/1575) 2024-08-08T20:53:22.5264846Z remote: Compressing objects: 16% (252/1575) 2024-08-08T20:53:22.5391467Z remote: Compressing objects: 17% (268/1575) 2024-08-08T20:53:22.5474897Z remote: Compressing objects: 18% (284/1575) 2024-08-08T20:53:22.5533057Z remote: Compressing objects: 19% (300/1575) 2024-08-08T20:53:22.5567643Z remote: Compressing objects: 20% (315/1575) 2024-08-08T20:53:22.5597107Z remote: Compressing objects: 21% (331/1575) 2024-08-08T20:53:22.5605525Z remote: Compressing objects: 22% (347/1575) 2024-08-08T20:53:22.5644017Z remote: Compressing objects: 23% (363/1575) 2024-08-08T20:53:22.5678927Z remote: Compressing objects: 24% (378/1575) 2024-08-08T20:53:22.5690946Z remote: Compressing objects: 25% (394/1575) 2024-08-08T20:53:22.5707380Z remote: Compressing objects: 26% (410/1575) 2024-08-08T20:53:22.5731934Z remote: Compressing objects: 27% (426/1575) 2024-08-08T20:53:22.5756159Z remote: Compressing objects: 28% (441/1575) 2024-08-08T20:53:22.5798156Z remote: Compressing objects: 29% (457/1575) 2024-08-08T20:53:22.5804876Z remote: Compressing objects: 30% (473/1575) 2024-08-08T20:53:22.5812775Z remote: Compressing objects: 31% (489/1575) 2024-08-08T20:53:22.5830872Z remote: Compressing objects: 32% (504/1575) 2024-08-08T20:53:22.5855247Z remote: Compressing objects: 33% (520/1575) 2024-08-08T20:53:22.5886390Z remote: Compressing objects: 34% (536/1575) 2024-08-08T20:53:22.5900039Z remote: Compressing objects: 35% (552/1575) 2024-08-08T20:53:22.5925391Z remote: Compressing objects: 36% (567/1575) 2024-08-08T20:53:22.5945911Z remote: Compressing objects: 37% (583/1575) 2024-08-08T20:53:22.5966782Z remote: Compressing objects: 38% (599/1575) 2024-08-08T20:53:22.5980210Z remote: Compressing objects: 39% (615/1575) 2024-08-08T20:53:22.5991435Z remote: Compressing objects: 40% (630/1575) 2024-08-08T20:53:22.6007481Z remote: Compressing objects: 41% (646/1575) 2024-08-08T20:53:22.6027943Z remote: Compressing objects: 42% (662/1575) 2024-08-08T20:53:22.6043570Z remote: Compressing objects: 43% (678/1575) 2024-08-08T20:53:22.6057454Z remote: Compressing objects: 44% (693/1575) 2024-08-08T20:53:22.6073640Z remote: Compressing objects: 45% (709/1575) 2024-08-08T20:53:22.6086099Z remote: Compressing objects: 46% (725/1575) 2024-08-08T20:53:22.6097633Z remote: Compressing objects: 47% (741/1575) 2024-08-08T20:53:22.6108485Z remote: Compressing objects: 48% (756/1575) 2024-08-08T20:53:22.6117930Z remote: Compressing objects: 49% (772/1575) 2024-08-08T20:53:22.6123623Z remote: Compressing objects: 50% (788/1575) 2024-08-08T20:53:22.6130278Z remote: Compressing objects: 51% (804/1575) 2024-08-08T20:53:22.6136360Z remote: Compressing objects: 52% (819/1575) 2024-08-08T20:53:22.6142498Z remote: Compressing objects: 53% (835/1575) 2024-08-08T20:53:22.6147445Z remote: Compressing objects: 54% (851/1575) 2024-08-08T20:53:22.6152302Z remote: Compressing objects: 55% (867/1575) 2024-08-08T20:53:22.6157398Z remote: Compressing objects: 56% (882/1575) 2024-08-08T20:53:22.6162310Z remote: Compressing objects: 57% (898/1575) 2024-08-08T20:53:22.6166586Z remote: Compressing objects: 58% (914/1575) 2024-08-08T20:53:22.6170720Z remote: Compressing objects: 59% (930/1575) 2024-08-08T20:53:22.6173202Z remote: Compressing objects: 60% (945/1575) 2024-08-08T20:53:22.6175565Z remote: Compressing objects: 61% (961/1575) 2024-08-08T20:53:22.6177064Z remote: Compressing objects: 62% (977/1575) 2024-08-08T20:53:22.6179434Z remote: Compressing objects: 63% (993/1575) 2024-08-08T20:53:22.6180713Z remote: Compressing objects: 64% (1008/1575) 2024-08-08T20:53:22.6182432Z remote: Compressing objects: 65% (1024/1575) 2024-08-08T20:53:22.6183134Z remote: Compressing objects: 66% (1040/1575) 2024-08-08T20:53:22.6183684Z remote: Compressing objects: 67% (1056/1575) 2024-08-08T20:53:22.6184227Z remote: Compressing objects: 68% (1071/1575) 2024-08-08T20:53:22.6186595Z remote: Compressing objects: 69% (1087/1575) 2024-08-08T20:53:22.6195455Z remote: Compressing objects: 70% (1103/1575) 2024-08-08T20:53:22.6201665Z remote: Compressing objects: 71% (1119/1575) 2024-08-08T20:53:22.6211837Z remote: Compressing objects: 72% (1134/1575) 2024-08-08T20:53:22.6218640Z remote: Compressing objects: 73% (1150/1575) 2024-08-08T20:53:22.6225340Z remote: Compressing objects: 74% (1166/1575) 2024-08-08T20:53:22.6228973Z remote: Compressing objects: 75% (1182/1575) 2024-08-08T20:53:22.6232592Z remote: Compressing objects: 76% (1197/1575) 2024-08-08T20:53:22.6238802Z remote: Compressing objects: 77% (1213/1575) 2024-08-08T20:53:22.6242974Z remote: Compressing objects: 78% (1229/1575) 2024-08-08T20:53:22.6247339Z remote: Compressing objects: 79% (1245/1575) 2024-08-08T20:53:22.6251772Z remote: Compressing objects: 80% (1260/1575) 2024-08-08T20:53:22.6255558Z remote: Compressing objects: 81% (1276/1575) 2024-08-08T20:53:22.6259263Z remote: Compressing objects: 82% (1292/1575) 2024-08-08T20:53:22.6263272Z remote: Compressing objects: 83% (1308/1575) 2024-08-08T20:53:22.6267458Z remote: Compressing objects: 84% (1323/1575) 2024-08-08T20:53:22.6270889Z remote: Compressing objects: 85% (1339/1575) 2024-08-08T20:53:22.6273767Z remote: Compressing objects: 86% (1355/1575) 2024-08-08T20:53:22.6277046Z remote: Compressing objects: 87% (1371/1575) 2024-08-08T20:53:22.6279293Z remote: Compressing objects: 88% (1386/1575) 2024-08-08T20:53:22.6281248Z remote: Compressing objects: 89% (1402/1575) 2024-08-08T20:53:22.6283568Z remote: Compressing objects: 90% (1418/1575) 2024-08-08T20:53:22.6284215Z remote: Compressing objects: 91% (1434/1575) 2024-08-08T20:53:22.6287445Z remote: Compressing objects: 92% (1449/1575) 2024-08-08T20:53:22.6288302Z remote: Compressing objects: 93% (1465/1575) 2024-08-08T20:53:22.6289488Z remote: Compressing objects: 94% (1481/1575) 2024-08-08T20:53:22.6292293Z remote: Compressing objects: 95% (1497/1575) 2024-08-08T20:53:22.6294491Z remote: Compressing objects: 96% (1512/1575) 2024-08-08T20:53:22.6295646Z remote: Compressing objects: 97% (1528/1575) 2024-08-08T20:53:22.6297663Z remote: Compressing objects: 98% (1544/1575) 2024-08-08T20:53:22.6300000Z remote: Compressing objects: 99% (1560/1575) 2024-08-08T20:53:22.6300795Z remote: Compressing objects: 100% (1575/1575) 2024-08-08T20:53:22.6301367Z remote: Compressing objects: 100% (1575/1575), done. 2024-08-08T20:53:44.7841026Z remote: Total 1011899 (delta 1916), reused 2096 (delta 1334), pack-reused 1008986 2024-08-08T20:54:08.4523601Z [command]/usr/bin/git rev-parse --verify --quiet b9d86fa89636e301796d4201f36d86c73f6e49bc^{object} 2024-08-08T20:54:08.4561949Z b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T20:54:08.4567903Z ##[endgroup] 2024-08-08T20:54:08.4568445Z ##[group]Determining the checkout info 2024-08-08T20:54:08.4569404Z ##[endgroup] 2024-08-08T20:54:08.4569927Z ##[group]Checking out the ref 2024-08-08T20:54:08.4573282Z [command]/usr/bin/git checkout --quiet --force b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T20:54:10.1353492Z ##[endgroup] 2024-08-08T20:54:10.1354142Z ##[group]Setting up auth for fetching submodules 2024-08-08T20:54:10.1358987Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2024-08-08T20:54:10.1420719Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2024-08-08T20:54:10.1462000Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2024-08-08T20:54:10.1503131Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2024-08-08T20:54:10.1541523Z ##[endgroup] 2024-08-08T20:54:10.1542046Z ##[group]Fetching submodules 2024-08-08T20:54:10.1545400Z [command]/usr/bin/git submodule sync --recursive 2024-08-08T20:54:10.1924901Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2024-08-08T20:54:10.2289295Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2024-08-08T20:54:10.2291174Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2024-08-08T20:54:10.2294623Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2024-08-08T20:54:10.2298652Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2024-08-08T20:54:10.2302163Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2024-08-08T20:54:10.2305065Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2024-08-08T20:54:10.2308560Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2024-08-08T20:54:10.2312539Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2024-08-08T20:54:10.2316419Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2024-08-08T20:54:10.2320664Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2024-08-08T20:54:10.2324208Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2024-08-08T20:54:10.2328067Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen' 2024-08-08T20:54:10.2331909Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2024-08-08T20:54:10.2336034Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2024-08-08T20:54:10.2339913Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2024-08-08T20:54:10.2343794Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi' 2024-08-08T20:54:10.2351392Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2024-08-08T20:54:10.2355605Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo' 2024-08-08T20:54:10.2359961Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2024-08-08T20:54:10.2364343Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2024-08-08T20:54:10.2368777Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2024-08-08T20:54:10.2373189Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2024-08-08T20:54:10.2377831Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2024-08-08T20:54:10.2382456Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl' 2024-08-08T20:54:10.2387096Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2024-08-08T20:54:10.2391834Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2024-08-08T20:54:10.2397078Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2024-08-08T20:54:10.2401754Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2024-08-08T20:54:10.2406906Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2024-08-08T20:54:10.2411907Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2024-08-08T20:54:10.2418543Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2024-08-08T20:54:10.2423337Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2024-08-08T20:54:10.2428553Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2024-08-08T20:54:10.2437163Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2024-08-08T20:54:10.2442606Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2024-08-08T20:54:10.2476297Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2024-08-08T20:54:10.5715940Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2024-08-08T20:54:10.7557911Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2024-08-08T20:54:10.9610801Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2024-08-08T20:54:11.1943505Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2024-08-08T20:54:13.2167868Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2024-08-08T20:54:24.8405465Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2024-08-08T20:54:25.2719087Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpp-httplib'... 2024-08-08T20:54:25.7735544Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2024-08-08T20:54:26.4055983Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2024-08-08T20:54:27.6467101Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2024-08-08T20:54:29.6651937Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'... 2024-08-08T20:54:35.5198929Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2024-08-08T20:54:36.8042314Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2024-08-08T20:54:38.6091276Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2024-08-08T20:54:39.9118875Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'... 2024-08-08T20:54:40.1195553Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2024-08-08T20:54:40.5629165Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2024-08-08T20:54:40.9148573Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2024-08-08T20:54:42.0545067Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2024-08-08T20:54:42.4281106Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2024-08-08T20:54:42.6834401Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2024-08-08T20:54:44.4284871Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/mimalloc'... 2024-08-08T20:54:45.1886802Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'... 2024-08-08T20:54:45.8513107Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2024-08-08T20:54:51.8659917Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2024-08-08T20:54:53.8658656Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2024-08-08T20:54:57.9190770Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2024-08-08T20:54:58.1593957Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2024-08-08T20:55:06.8543056Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2024-08-08T20:55:07.0841655Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2024-08-08T20:55:07.2962133Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2024-08-08T20:55:08.1780109Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2024-08-08T20:55:08.4578947Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2024-08-08T20:55:09.0595482Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2024-08-08T20:55:09.5742806Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2024-08-08T20:55:09.5892664Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2024-08-08T20:55:09.6009666Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2024-08-08T20:55:09.6323550Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2024-08-08T20:55:09.6759394Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191' 2024-08-08T20:55:10.8526565Z Submodule path 'third_party/XNNPACK': checked out 'fcbf55af6cf28a4627bcd1f703ab7ad843f0f3a2' 2024-08-08T20:55:10.8813187Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2024-08-08T20:55:10.9341681Z Submodule path 'third_party/cpp-httplib': checked out '3b6597bba913d51161383657829b7e644e59c006' 2024-08-08T20:55:11.0475323Z Submodule path 'third_party/cpuinfo': checked out '3c8b1533ac03dd6531ab6e7b9245d488f13a82a5' 2024-08-08T20:55:11.0879126Z Submodule path 'third_party/cudnn_frontend': checked out '98ca4e1941fe3263f128f74f10063a3ea35c7019' 2024-08-08T20:55:11.7115664Z Submodule path 'third_party/cutlass': checked out 'bbe579a9e3beb6ea6626d9227ec32d0dae119a49' 2024-08-08T20:55:12.0004095Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c' 2024-08-08T20:55:12.0910796Z Submodule path 'third_party/fbgemm': checked out 'dbc3157bf256f1339b3fa1fef2be89ac4078be0e' 2024-08-08T20:55:12.0930244Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit' 2024-08-08T20:55:12.0933720Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T20:55:12.0937686Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/fbgemm/third_party/cutlass' 2024-08-08T20:55:12.0941284Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest' 2024-08-08T20:55:12.0944963Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T20:55:12.0977481Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'... 2024-08-08T20:55:13.0773118Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'... 2024-08-08T20:55:13.6766501Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cutlass'... 2024-08-08T20:55:15.6823681Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'... 2024-08-08T20:55:16.7777770Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'... 2024-08-08T20:55:17.1290038Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81' 2024-08-08T20:55:17.2401946Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3' 2024-08-08T20:55:17.7479669Z Submodule path 'third_party/fbgemm/third_party/cutlass': checked out 'fc9ebc645b63f3a6bc80aaefde5c063fb72110d6' 2024-08-08T20:55:17.8173175Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796' 2024-08-08T20:55:17.8326476Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '23f53b025b466d8ec3c45d52290d3442f7fbe6b1' 2024-08-08T20:55:17.9847983Z Submodule path 'third_party/flatbuffers': checked out '01834de25e4bf3975a9a00e816292b1ad0fe184b' 2024-08-08T20:55:18.0306216Z Submodule path 'third_party/fmt': checked out '0c9fce2ffefecfdce794e1859584e25877b7b592' 2024-08-08T20:55:18.0426609Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad' 2024-08-08T20:55:18.0882335Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2024-08-08T20:55:18.1206438Z Submodule path 'third_party/gloo': checked out '5354032ea08eadd7fc4456477f7f7c6308818509' 2024-08-08T20:55:18.1740078Z Submodule path 'third_party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2024-08-08T20:55:18.1897736Z Submodule path 'third_party/ideep': checked out '55ca0191687aaf19aca5cdb7881c791e3bea442b' 2024-08-08T20:55:18.1919884Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2024-08-08T20:55:18.1950509Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 2024-08-08T20:55:32.3302223Z Submodule path 'third_party/ideep/mkl-dnn': checked out '1137e04ec0b5251ca2b4400a4fd3c667ce843d67' 2024-08-08T20:55:32.3526030Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' 2024-08-08T20:55:32.4517800Z Submodule path 'third_party/kineto': checked out 'da2f2682cabaf95d601fa2a9b7e0979f84fe7667' 2024-08-08T20:55:32.4537359Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T20:55:32.4540464Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T20:55:32.4544017Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T20:55:32.4576370Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2024-08-08T20:55:32.9816609Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2024-08-08T20:55:34.1301881Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 2024-08-08T20:55:35.2992305Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2024-08-08T20:55:35.3013819Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T20:55:35.3017498Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T20:55:35.3021159Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T20:55:35.3024836Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T20:55:35.3028538Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T20:55:35.3032871Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T20:55:35.3037026Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T20:55:35.3040845Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T20:55:35.3072757Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2024-08-08T20:55:36.1625736Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 2024-08-08T20:55:36.5693565Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2024-08-08T20:55:37.7531352Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2024-08-08T20:55:38.0350726Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2024-08-08T20:55:38.5867074Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2024-08-08T20:55:39.7001740Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 2024-08-08T20:55:47.0215765Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2024-08-08T20:55:47.4438239Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2024-08-08T20:55:47.4681942Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2024-08-08T20:55:47.5118546Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2024-08-08T20:55:47.5287721Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2024-08-08T20:55:47.5307483Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T20:55:47.5340411Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 2024-08-08T20:55:47.8257093Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2024-08-08T20:55:47.8483893Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2024-08-08T20:55:47.8969279Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2024-08-08T20:55:48.0169010Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2024-08-08T20:55:48.0364894Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2024-08-08T20:55:48.0815472Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2024-08-08T20:55:48.1479803Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2024-08-08T20:55:48.1934996Z Submodule path 'third_party/mimalloc': checked out 'b66e3214d8a104669c2ec05ae91ebc26a8f5ab78' 2024-08-08T20:55:48.2233363Z Submodule path 'third_party/nccl/nccl': checked out 'ab2b89c4c339bd7f816fbc114a4b05d386b66290' 2024-08-08T20:55:48.3454858Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806' 2024-08-08T20:55:48.8403936Z Submodule path 'third_party/onnx': checked out '3bf92c03a9f27eba3bda1e5b9e63ea20ec213557' 2024-08-08T20:55:48.8439908Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark' 2024-08-08T20:55:48.8443217Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2024-08-08T20:55:48.8475875Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'... 2024-08-08T20:55:49.3818400Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 2024-08-08T20:55:50.2473506Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '2dd015dfef425c866d9a43f2c67d8b52d709acb6' 2024-08-08T20:55:50.2877893Z Submodule path 'third_party/onnx/third_party/pybind11': checked out '5b0a6fc2017fcc176545afe3e09c9f9885283242' 2024-08-08T20:55:50.3779095Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2024-08-08T20:55:50.3800827Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T20:55:50.3803948Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T20:55:50.3807457Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T20:55:50.3810932Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T20:55:50.3814745Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T20:55:50.3818724Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T20:55:50.3822300Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T20:55:50.3825924Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T20:55:50.3859284Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 2024-08-08T20:55:50.8123573Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 2024-08-08T20:55:51.8583185Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2024-08-08T20:55:52.1788846Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2024-08-08T20:55:58.2091831Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2024-08-08T20:55:58.4777402Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2024-08-08T20:55:58.6969398Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2024-08-08T20:55:59.0694362Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 2024-08-08T20:56:06.6318967Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2024-08-08T20:56:06.6784490Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2024-08-08T20:56:06.6972476Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2024-08-08T20:56:06.8213102Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2024-08-08T20:56:06.8377318Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2024-08-08T20:56:06.8558709Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2024-08-08T20:56:06.8756408Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2024-08-08T20:56:06.8775557Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T20:56:06.8778804Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T20:56:06.8810566Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2024-08-08T20:56:08.7647932Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 2024-08-08T20:56:10.1525816Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2024-08-08T20:56:10.2063620Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2024-08-08T20:56:10.8542330Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2024-08-08T20:56:10.8689890Z Submodule path 'third_party/pocketfft': checked out '9d3ab05a7fffbc71a492bc6a17be034e83e8f0fe' 2024-08-08T20:56:11.1890993Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2024-08-08T20:56:11.1914232Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2024-08-08T20:56:11.1918143Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2024-08-08T20:56:11.1950526Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2024-08-08T20:56:11.6867721Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 2024-08-08T20:56:12.8085683Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2024-08-08T20:56:12.8903016Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2024-08-08T20:56:12.9024662Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2024-08-08T20:56:12.9179928Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2024-08-08T20:56:12.9624991Z Submodule path 'third_party/pybind11': checked out '941f45bcb51457884fa1afd6e24a67377d70f75c' 2024-08-08T20:56:12.9967407Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2024-08-08T20:56:13.0457873Z Submodule path 'third_party/sleef': checked out '60e76d2bce17d278b439d9da17177c8f957a9e9b' 2024-08-08T20:56:13.0801639Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2024-08-08T20:56:13.0821569Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2024-08-08T20:56:13.0824845Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2024-08-08T20:56:13.0827941Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2024-08-08T20:56:13.0831596Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T20:56:13.0862684Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2024-08-08T20:56:14.1662924Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2024-08-08T20:56:14.4003690Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2024-08-08T20:56:15.6471283Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2024-08-08T20:56:16.5814503Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2024-08-08T20:56:16.6007894Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2024-08-08T20:56:16.6790234Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2024-08-08T20:56:16.7140414Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2024-08-08T20:56:16.7162140Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T20:56:16.7192755Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 2024-08-08T20:56:16.9715758Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2024-08-08T20:56:16.9764932Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2024-08-08T20:56:17.0133616Z Entering 'android/libs/fbjni' 2024-08-08T20:56:17.0185854Z Entering 'third_party/FP16' 2024-08-08T20:56:17.0239382Z Entering 'third_party/FXdiv' 2024-08-08T20:56:17.0292192Z Entering 'third_party/NNPACK' 2024-08-08T20:56:17.0346452Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T20:56:17.0399299Z Entering 'third_party/XNNPACK' 2024-08-08T20:56:17.0468415Z Entering 'third_party/benchmark' 2024-08-08T20:56:17.0520150Z Entering 'third_party/cpp-httplib' 2024-08-08T20:56:17.0572004Z Entering 'third_party/cpuinfo' 2024-08-08T20:56:17.0624464Z Entering 'third_party/cudnn_frontend' 2024-08-08T20:56:17.0676042Z Entering 'third_party/cutlass' 2024-08-08T20:56:17.0735715Z Entering 'third_party/eigen' 2024-08-08T20:56:17.0789261Z Entering 'third_party/fbgemm' 2024-08-08T20:56:17.0840926Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T20:56:17.0896487Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T20:56:17.0947083Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T20:56:17.1003693Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T20:56:17.1056930Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T20:56:17.1108472Z Entering 'third_party/flatbuffers' 2024-08-08T20:56:17.1163123Z Entering 'third_party/fmt' 2024-08-08T20:56:17.1214586Z Entering 'third_party/foxi' 2024-08-08T20:56:17.1266315Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T20:56:17.1318690Z Entering 'third_party/gloo' 2024-08-08T20:56:17.1370848Z Entering 'third_party/googletest' 2024-08-08T20:56:17.1425366Z Entering 'third_party/ideep' 2024-08-08T20:56:17.1475017Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T20:56:17.1533545Z Entering 'third_party/ittapi' 2024-08-08T20:56:17.1586304Z Entering 'third_party/kineto' 2024-08-08T20:56:17.1638473Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T20:56:17.1688154Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T20:56:17.1741870Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T20:56:17.1793050Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T20:56:17.1844398Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T20:56:17.1893102Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T20:56:17.1953522Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T20:56:17.2004724Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T20:56:17.2056261Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T20:56:17.2108029Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T20:56:17.2162363Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T20:56:17.2212197Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T20:56:17.2267233Z Entering 'third_party/mimalloc' 2024-08-08T20:56:17.2319951Z Entering 'third_party/nccl/nccl' 2024-08-08T20:56:17.2373161Z Entering 'third_party/nlohmann' 2024-08-08T20:56:17.2427494Z Entering 'third_party/onnx' 2024-08-08T20:56:17.2494837Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T20:56:17.2546960Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T20:56:17.2603765Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T20:56:17.2660833Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T20:56:17.2717293Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T20:56:17.2768817Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T20:56:17.2818622Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T20:56:17.2871623Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T20:56:17.2922277Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T20:56:17.2971820Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T20:56:17.3020409Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T20:56:17.3072725Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T20:56:17.3126400Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T20:56:17.3200872Z Entering 'third_party/pocketfft' 2024-08-08T20:56:17.3254600Z Entering 'third_party/protobuf' 2024-08-08T20:56:17.3306641Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T20:56:17.3359202Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T20:56:17.3413193Z Entering 'third_party/psimd' 2024-08-08T20:56:17.3470267Z Entering 'third_party/pthreadpool' 2024-08-08T20:56:17.3524634Z Entering 'third_party/pybind11' 2024-08-08T20:56:17.3577321Z Entering 'third_party/python-peachpy' 2024-08-08T20:56:17.3630326Z Entering 'third_party/sleef' 2024-08-08T20:56:17.3683362Z Entering 'third_party/tensorpipe' 2024-08-08T20:56:17.3736018Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T20:56:17.3786325Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T20:56:17.3836805Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T20:56:17.3886743Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T20:56:17.3936158Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T20:56:17.4013619Z ##[endgroup] 2024-08-08T20:56:17.4016206Z ##[group]Persisting credentials for submodules 2024-08-08T20:56:17.4018290Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2024-08-08T20:56:17.4388467Z Entering 'android/libs/fbjni' 2024-08-08T20:56:17.4457749Z Entering 'third_party/FP16' 2024-08-08T20:56:17.4527236Z Entering 'third_party/FXdiv' 2024-08-08T20:56:17.4594208Z Entering 'third_party/NNPACK' 2024-08-08T20:56:17.4661569Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T20:56:17.4729379Z Entering 'third_party/XNNPACK' 2024-08-08T20:56:17.4812367Z Entering 'third_party/benchmark' 2024-08-08T20:56:17.4879829Z Entering 'third_party/cpp-httplib' 2024-08-08T20:56:17.4947742Z Entering 'third_party/cpuinfo' 2024-08-08T20:56:17.5014959Z Entering 'third_party/cudnn_frontend' 2024-08-08T20:56:17.5084994Z Entering 'third_party/cutlass' 2024-08-08T20:56:17.5159606Z Entering 'third_party/eigen' 2024-08-08T20:56:17.5230753Z Entering 'third_party/fbgemm' 2024-08-08T20:56:17.5301166Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T20:56:17.5368766Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T20:56:17.5434645Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T20:56:17.5507899Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T20:56:17.5574188Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T20:56:17.5642666Z Entering 'third_party/flatbuffers' 2024-08-08T20:56:17.5717311Z Entering 'third_party/fmt' 2024-08-08T20:56:17.5784757Z Entering 'third_party/foxi' 2024-08-08T20:56:17.5858478Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T20:56:17.5928364Z Entering 'third_party/gloo' 2024-08-08T20:56:17.5995432Z Entering 'third_party/googletest' 2024-08-08T20:56:17.6062833Z Entering 'third_party/ideep' 2024-08-08T20:56:17.6127946Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T20:56:17.6203423Z Entering 'third_party/ittapi' 2024-08-08T20:56:17.6271592Z Entering 'third_party/kineto' 2024-08-08T20:56:17.6343764Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T20:56:17.6409671Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T20:56:17.6478212Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T20:56:17.6545392Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T20:56:17.6611445Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T20:56:17.6683379Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T20:56:17.6755930Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T20:56:17.6824909Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T20:56:17.6896456Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T20:56:17.6964415Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T20:56:17.7034238Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T20:56:17.7101502Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T20:56:17.7173288Z Entering 'third_party/mimalloc' 2024-08-08T20:56:17.7241040Z Entering 'third_party/nccl/nccl' 2024-08-08T20:56:17.7307618Z Entering 'third_party/nlohmann' 2024-08-08T20:56:17.7379542Z Entering 'third_party/onnx' 2024-08-08T20:56:17.7469383Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T20:56:17.7537406Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T20:56:17.7609270Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T20:56:17.7677158Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T20:56:17.7742557Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T20:56:17.7807817Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T20:56:17.7873010Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T20:56:17.7939858Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T20:56:17.8005413Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T20:56:17.8071713Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T20:56:17.8135171Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T20:56:17.8202924Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T20:56:17.8272672Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T20:56:17.8359251Z Entering 'third_party/pocketfft' 2024-08-08T20:56:17.8425837Z Entering 'third_party/protobuf' 2024-08-08T20:56:17.8495253Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T20:56:17.8562889Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T20:56:17.8631990Z Entering 'third_party/psimd' 2024-08-08T20:56:17.8702220Z Entering 'third_party/pthreadpool' 2024-08-08T20:56:17.8776692Z Entering 'third_party/pybind11' 2024-08-08T20:56:17.8843715Z Entering 'third_party/python-peachpy' 2024-08-08T20:56:17.8910346Z Entering 'third_party/sleef' 2024-08-08T20:56:17.8982785Z Entering 'third_party/tensorpipe' 2024-08-08T20:56:17.9052607Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T20:56:17.9119218Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T20:56:17.9183970Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T20:56:17.9250121Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T20:56:17.9312769Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T20:56:17.9411814Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2024-08-08T20:56:17.9781507Z Entering 'android/libs/fbjni' 2024-08-08T20:56:17.9846415Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2024-08-08T20:56:17.9867611Z Entering 'third_party/FP16' 2024-08-08T20:56:17.9927629Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2024-08-08T20:56:17.9948820Z Entering 'third_party/FXdiv' 2024-08-08T20:56:18.0010618Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2024-08-08T20:56:18.0031981Z Entering 'third_party/NNPACK' 2024-08-08T20:56:18.0099888Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2024-08-08T20:56:18.0121623Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T20:56:18.0187067Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2024-08-08T20:56:18.0208252Z Entering 'third_party/XNNPACK' 2024-08-08T20:56:18.0269967Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2024-08-08T20:56:18.0307049Z Entering 'third_party/benchmark' 2024-08-08T20:56:18.0369656Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2024-08-08T20:56:18.0391512Z Entering 'third_party/cpp-httplib' 2024-08-08T20:56:18.0458000Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2024-08-08T20:56:18.0479381Z Entering 'third_party/cpuinfo' 2024-08-08T20:56:18.0539974Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2024-08-08T20:56:18.0561644Z Entering 'third_party/cudnn_frontend' 2024-08-08T20:56:18.0623724Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2024-08-08T20:56:18.0645272Z Entering 'third_party/cutlass' 2024-08-08T20:56:18.0709951Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2024-08-08T20:56:18.0740661Z Entering 'third_party/eigen' 2024-08-08T20:56:18.0802011Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2024-08-08T20:56:18.0826126Z Entering 'third_party/fbgemm' 2024-08-08T20:56:18.0887094Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2024-08-08T20:56:18.0907280Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T20:56:18.0969390Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2024-08-08T20:56:18.0990020Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T20:56:18.1051719Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2024-08-08T20:56:18.1073345Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T20:56:18.1134399Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cutlass/config remote.origin.url 2024-08-08T20:56:18.1161615Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T20:56:18.1222792Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.1242438Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T20:56:18.1304109Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2024-08-08T20:56:18.1328210Z Entering 'third_party/flatbuffers' 2024-08-08T20:56:18.1389402Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2024-08-08T20:56:18.1413553Z Entering 'third_party/fmt' 2024-08-08T20:56:18.1474751Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2024-08-08T20:56:18.1495754Z Entering 'third_party/foxi' 2024-08-08T20:56:18.1557473Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2024-08-08T20:56:18.1578838Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T20:56:18.1639655Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2024-08-08T20:56:18.1661035Z Entering 'third_party/gloo' 2024-08-08T20:56:18.1723051Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2024-08-08T20:56:18.1744856Z Entering 'third_party/googletest' 2024-08-08T20:56:18.1804878Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.1826746Z Entering 'third_party/ideep' 2024-08-08T20:56:18.1888737Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2024-08-08T20:56:18.1907819Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T20:56:18.1968145Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2024-08-08T20:56:18.1996578Z Entering 'third_party/ittapi' 2024-08-08T20:56:18.2056813Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2024-08-08T20:56:18.2078719Z Entering 'third_party/kineto' 2024-08-08T20:56:18.2140675Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2024-08-08T20:56:18.2160615Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T20:56:18.2222016Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2024-08-08T20:56:18.2240710Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T20:56:18.2304619Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2024-08-08T20:56:18.2327249Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T20:56:18.2389202Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2024-08-08T20:56:18.2409368Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T20:56:18.2470843Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2024-08-08T20:56:18.2496232Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T20:56:18.2559362Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2024-08-08T20:56:18.2577980Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T20:56:18.2643890Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2024-08-08T20:56:18.2667595Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T20:56:18.2730653Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2024-08-08T20:56:18.2750801Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T20:56:18.2813327Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.2835218Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T20:56:18.2901002Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2024-08-08T20:56:18.2924669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T20:56:18.2989476Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2024-08-08T20:56:18.3015461Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T20:56:18.3078512Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2024-08-08T20:56:18.3100431Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T20:56:18.3162865Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.3188289Z Entering 'third_party/mimalloc' 2024-08-08T20:56:18.3256299Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2024-08-08T20:56:18.3279570Z Entering 'third_party/nccl/nccl' 2024-08-08T20:56:18.3345598Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2024-08-08T20:56:18.3368360Z Entering 'third_party/nlohmann' 2024-08-08T20:56:18.3428787Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2024-08-08T20:56:18.3451970Z Entering 'third_party/onnx' 2024-08-08T20:56:18.3523344Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2024-08-08T20:56:18.3558698Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T20:56:18.3624940Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2024-08-08T20:56:18.3646147Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T20:56:18.3709433Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2024-08-08T20:56:18.3736227Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T20:56:18.3801108Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2024-08-08T20:56:18.3824190Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T20:56:18.3885669Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2024-08-08T20:56:18.3905758Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T20:56:18.3968749Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.3989478Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T20:56:18.4051300Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2024-08-08T20:56:18.4072025Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T20:56:18.4133004Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2024-08-08T20:56:18.4155434Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T20:56:18.4216749Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2024-08-08T20:56:18.4237353Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T20:56:18.4299404Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2024-08-08T20:56:18.4320836Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T20:56:18.4386521Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2024-08-08T20:56:18.4407132Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T20:56:18.4470535Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2024-08-08T20:56:18.4493747Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T20:56:18.4557687Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2024-08-08T20:56:18.4586492Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T20:56:18.4648091Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2024-08-08T20:56:18.4690475Z Entering 'third_party/pocketfft' 2024-08-08T20:56:18.4752415Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2024-08-08T20:56:18.4774441Z Entering 'third_party/protobuf' 2024-08-08T20:56:18.4838737Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2024-08-08T20:56:18.4862911Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T20:56:18.4924883Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2024-08-08T20:56:18.4945538Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T20:56:18.5007538Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.5033474Z Entering 'third_party/psimd' 2024-08-08T20:56:18.5104366Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2024-08-08T20:56:18.5129283Z Entering 'third_party/pthreadpool' 2024-08-08T20:56:18.5194272Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2024-08-08T20:56:18.5218703Z Entering 'third_party/pybind11' 2024-08-08T20:56:18.5279633Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2024-08-08T20:56:18.5301136Z Entering 'third_party/python-peachpy' 2024-08-08T20:56:18.5365293Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2024-08-08T20:56:18.5386254Z Entering 'third_party/sleef' 2024-08-08T20:56:18.5448634Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2024-08-08T20:56:18.5471183Z Entering 'third_party/tensorpipe' 2024-08-08T20:56:18.5536803Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2024-08-08T20:56:18.5557317Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T20:56:18.5624167Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2024-08-08T20:56:18.5645881Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T20:56:18.5708871Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2024-08-08T20:56:18.5730618Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T20:56:18.5796365Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2024-08-08T20:56:18.5817675Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T20:56:18.5879971Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2024-08-08T20:56:18.5899201Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T20:56:18.5964830Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2024-08-08T20:56:18.7534966Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2024-08-08T20:56:18.7913050Z Entering 'android/libs/fbjni' 2024-08-08T20:56:18.7964883Z Entering 'third_party/FP16' 2024-08-08T20:56:18.8021135Z Entering 'third_party/FXdiv' 2024-08-08T20:56:18.8077036Z Entering 'third_party/NNPACK' 2024-08-08T20:56:18.8131119Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T20:56:18.8185265Z Entering 'third_party/XNNPACK' 2024-08-08T20:56:18.8253900Z Entering 'third_party/benchmark' 2024-08-08T20:56:18.8306302Z Entering 'third_party/cpp-httplib' 2024-08-08T20:56:18.8359775Z Entering 'third_party/cpuinfo' 2024-08-08T20:56:18.8412409Z Entering 'third_party/cudnn_frontend' 2024-08-08T20:56:18.8470442Z Entering 'third_party/cutlass' 2024-08-08T20:56:18.8532694Z Entering 'third_party/eigen' 2024-08-08T20:56:18.8588562Z Entering 'third_party/fbgemm' 2024-08-08T20:56:18.8641393Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T20:56:18.8691677Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T20:56:18.8744161Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T20:56:18.8802677Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T20:56:18.8856336Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T20:56:18.8909025Z Entering 'third_party/flatbuffers' 2024-08-08T20:56:18.8967459Z Entering 'third_party/fmt' 2024-08-08T20:56:18.9022567Z Entering 'third_party/foxi' 2024-08-08T20:56:18.9076488Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T20:56:18.9130819Z Entering 'third_party/gloo' 2024-08-08T20:56:18.9185046Z Entering 'third_party/googletest' 2024-08-08T20:56:18.9238781Z Entering 'third_party/ideep' 2024-08-08T20:56:18.9290167Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T20:56:18.9355187Z Entering 'third_party/ittapi' 2024-08-08T20:56:18.9409390Z Entering 'third_party/kineto' 2024-08-08T20:56:18.9461152Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T20:56:18.9513203Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T20:56:18.9568358Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T20:56:18.9621820Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T20:56:18.9676452Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T20:56:18.9729446Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T20:56:18.9784954Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T20:56:18.9839253Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T20:56:18.9892604Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T20:56:18.9949004Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T20:56:19.0004920Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T20:56:19.0060209Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T20:56:19.0118634Z Entering 'third_party/mimalloc' 2024-08-08T20:56:19.0173363Z Entering 'third_party/nccl/nccl' 2024-08-08T20:56:19.0226748Z Entering 'third_party/nlohmann' 2024-08-08T20:56:19.0280236Z Entering 'third_party/onnx' 2024-08-08T20:56:19.0354445Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T20:56:19.0405878Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T20:56:19.0462313Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T20:56:19.0519329Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T20:56:19.0568765Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T20:56:19.0620050Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T20:56:19.0675210Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T20:56:19.0729458Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T20:56:19.0779851Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T20:56:19.0831274Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T20:56:19.0885453Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T20:56:19.0945505Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T20:56:19.1002890Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T20:56:19.1081361Z Entering 'third_party/pocketfft' 2024-08-08T20:56:19.1138304Z Entering 'third_party/protobuf' 2024-08-08T20:56:19.1201829Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T20:56:19.1254640Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T20:56:19.1317321Z Entering 'third_party/psimd' 2024-08-08T20:56:19.1371447Z Entering 'third_party/pthreadpool' 2024-08-08T20:56:19.1426367Z Entering 'third_party/pybind11' 2024-08-08T20:56:19.1488567Z Entering 'third_party/python-peachpy' 2024-08-08T20:56:19.1543723Z Entering 'third_party/sleef' 2024-08-08T20:56:19.1598778Z Entering 'third_party/tensorpipe' 2024-08-08T20:56:19.1652532Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T20:56:19.1705680Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T20:56:19.1758431Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T20:56:19.1808733Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T20:56:19.1864950Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T20:56:19.1945168Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2024-08-08T20:56:19.2323139Z Entering 'android/libs/fbjni' 2024-08-08T20:56:19.2374525Z Entering 'third_party/FP16' 2024-08-08T20:56:19.2435873Z Entering 'third_party/FXdiv' 2024-08-08T20:56:19.2488097Z Entering 'third_party/NNPACK' 2024-08-08T20:56:19.2541197Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T20:56:19.2594322Z Entering 'third_party/XNNPACK' 2024-08-08T20:56:19.2664388Z Entering 'third_party/benchmark' 2024-08-08T20:56:19.2718743Z Entering 'third_party/cpp-httplib' 2024-08-08T20:56:19.2771961Z Entering 'third_party/cpuinfo' 2024-08-08T20:56:19.2825847Z Entering 'third_party/cudnn_frontend' 2024-08-08T20:56:19.2879057Z Entering 'third_party/cutlass' 2024-08-08T20:56:19.2940163Z Entering 'third_party/eigen' 2024-08-08T20:56:19.2995525Z Entering 'third_party/fbgemm' 2024-08-08T20:56:19.3048407Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T20:56:19.3100441Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T20:56:19.3156866Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T20:56:19.3214243Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T20:56:19.3268413Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T20:56:19.3321961Z Entering 'third_party/flatbuffers' 2024-08-08T20:56:19.3378295Z Entering 'third_party/fmt' 2024-08-08T20:56:19.3430844Z Entering 'third_party/foxi' 2024-08-08T20:56:19.3481925Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T20:56:19.3537371Z Entering 'third_party/gloo' 2024-08-08T20:56:19.3589496Z Entering 'third_party/googletest' 2024-08-08T20:56:19.3641251Z Entering 'third_party/ideep' 2024-08-08T20:56:19.3690538Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T20:56:19.3754942Z Entering 'third_party/ittapi' 2024-08-08T20:56:19.3805967Z Entering 'third_party/kineto' 2024-08-08T20:56:19.3857240Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T20:56:19.3907097Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T20:56:19.3958953Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T20:56:19.4008843Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T20:56:19.4059610Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T20:56:19.4107614Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T20:56:19.4162486Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T20:56:19.4212993Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T20:56:19.4266331Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T20:56:19.4319508Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T20:56:19.4372761Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T20:56:19.4424039Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T20:56:19.4478840Z Entering 'third_party/mimalloc' 2024-08-08T20:56:19.4531789Z Entering 'third_party/nccl/nccl' 2024-08-08T20:56:19.4584440Z Entering 'third_party/nlohmann' 2024-08-08T20:56:19.4640428Z Entering 'third_party/onnx' 2024-08-08T20:56:19.4707798Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T20:56:19.4758717Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T20:56:19.4815461Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T20:56:19.4869277Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T20:56:19.4919593Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T20:56:19.4970728Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T20:56:19.5020859Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T20:56:19.5071714Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T20:56:19.5126687Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T20:56:19.5176893Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T20:56:19.5225277Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T20:56:19.5278171Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T20:56:19.5332443Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T20:56:19.5404309Z Entering 'third_party/pocketfft' 2024-08-08T20:56:19.5457740Z Entering 'third_party/protobuf' 2024-08-08T20:56:19.5511480Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T20:56:19.5562664Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T20:56:19.5616782Z Entering 'third_party/psimd' 2024-08-08T20:56:19.5668806Z Entering 'third_party/pthreadpool' 2024-08-08T20:56:19.5721249Z Entering 'third_party/pybind11' 2024-08-08T20:56:19.5773306Z Entering 'third_party/python-peachpy' 2024-08-08T20:56:19.5825204Z Entering 'third_party/sleef' 2024-08-08T20:56:19.5876705Z Entering 'third_party/tensorpipe' 2024-08-08T20:56:19.5927966Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T20:56:19.5979792Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T20:56:19.6029926Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T20:56:19.6079712Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T20:56:19.6129273Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T20:56:19.6206114Z ##[endgroup] 2024-08-08T20:56:19.6258989Z [command]/usr/bin/git log -1 --format='%H' 2024-08-08T20:56:19.6295864Z 'b9d86fa89636e301796d4201f36d86c73f6e49bc' 2024-08-08T20:56:19.6501364Z Prepare all required actions 2024-08-08T20:56:19.6501820Z Getting action download info 2024-08-08T20:56:19.8064415Z ##[group]Run ./.github/actions/setup-linux 2024-08-08T20:56:19.8064846Z env: 2024-08-08T20:56:19.8065125Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:19.8065462Z ##[endgroup] 2024-08-08T20:56:19.8126921Z ##[group]Run set -euo pipefail 2024-08-08T20:56:19.8127364Z set -euo pipefail 2024-08-08T20:56:19.8127749Z function get_ec2_metadata() { 2024-08-08T20:56:19.8128269Z  # Pulled from instance metadata endpoint for EC2 2024-08-08T20:56:19.8129158Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2024-08-08T20:56:19.8129917Z  category=$1 2024-08-08T20:56:19.8130433Z  # If it is GCP runner (runner name contains gcp), do not run this 2024-08-08T20:56:19.8131060Z  runner_name_str=i-09ba204a38896644d 2024-08-08T20:56:19.8131546Z  if [[ -f /.inarc ]]; then 2024-08-08T20:56:19.8132038Z  echo "ARC Runner, no info on ec2 metadata" 2024-08-08T20:56:19.8132594Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2024-08-08T20:56:19.8133270Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2024-08-08T20:56:19.8133862Z  else 2024-08-08T20:56:19.8134350Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2024-08-08T20:56:19.8134918Z  fi 2024-08-08T20:56:19.8135198Z } 2024-08-08T20:56:19.8135562Z echo "ami-id: $(get_ec2_metadata ami-id)" 2024-08-08T20:56:19.8136136Z echo "instance-id: $(get_ec2_metadata instance-id)" 2024-08-08T20:56:19.8136775Z echo "instance-type: $(get_ec2_metadata instance-type)" 2024-08-08T20:56:19.8137336Z echo "system info $(uname -a)" 2024-08-08T20:56:19.8147126Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:19.8147610Z env: 2024-08-08T20:56:19.8147893Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:19.8148243Z ##[endgroup] 2024-08-08T20:56:19.8247747Z ami-id: ami-06c68f701d8090592 2024-08-08T20:56:19.8308151Z instance-id: i-09ba204a38896644d 2024-08-08T20:56:19.8365194Z instance-type: g5.4xlarge 2024-08-08T20:56:19.8378095Z system info Linux ip-10-0-41-199.ec2.internal 6.1.94-99.176.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Jun 18 14:57:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux 2024-08-08T20:56:19.8403731Z ##[group]Run echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> $GITHUB_OUTPUT 2024-08-08T20:56:19.8404659Z echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> $GITHUB_OUTPUT 2024-08-08T20:56:19.8413935Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:19.8414425Z env: 2024-08-08T20:56:19.8414692Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:19.8415029Z ##[endgroup] 2024-08-08T20:56:19.8493139Z ##[group]Run if systemctl is-active --quiet docker; then 2024-08-08T20:56:19.8493735Z if systemctl is-active --quiet docker; then 2024-08-08T20:56:19.8494263Z  echo "Docker daemon is running..."; 2024-08-08T20:56:19.8494706Z else 2024-08-08T20:56:19.8495184Z  echo "Starting docker deamon..." && sudo systemctl start docker; 2024-08-08T20:56:19.8495750Z fi 2024-08-08T20:56:19.8504426Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:19.8504910Z env: 2024-08-08T20:56:19.8505190Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:19.8505720Z ##[endgroup] 2024-08-08T20:56:19.8590500Z Docker daemon is running... 2024-08-08T20:56:19.8639627Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2024-08-08T20:56:19.8640182Z with: 2024-08-08T20:56:19.8640446Z shell: bash 2024-08-08T20:56:19.8640735Z timeout_minutes: 5 2024-08-08T20:56:19.8641053Z max_attempts: 3 2024-08-08T20:56:19.8641366Z retry_wait_seconds: 30 2024-08-08T20:56:19.8642920Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2024-08-08T20:56:19.8644339Z polling_interval_seconds: 1 2024-08-08T20:56:19.8644717Z warning_on_retry: true 2024-08-08T20:56:19.8645065Z continue_on_error: false 2024-08-08T20:56:19.8645390Z env: 2024-08-08T20:56:19.8645663Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:19.8646015Z AWS_RETRY_MODE: standard 2024-08-08T20:56:19.8646366Z AWS_MAX_ATTEMPTS: 5 2024-08-08T20:56:19.8646703Z AWS_DEFAULT_REGION: us-east-1 2024-08-08T20:56:19.8647066Z ##[endgroup] 2024-08-08T20:56:21.0340486Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2024-08-08T20:56:21.0341355Z Configure a credential helper to remove this warning. See 2024-08-08T20:56:21.0342987Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-08-08T20:56:21.0343620Z 2024-08-08T20:56:21.0343805Z Login Succeeded 2024-08-08T20:56:21.9203821Z Command completed after 1 attempt(s). 2024-08-08T20:56:21.9267362Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-08-08T20:56:21.9268074Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-08-08T20:56:21.9268720Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2024-08-08T20:56:21.9277984Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:21.9278473Z env: 2024-08-08T20:56:21.9278761Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:21.9279105Z ##[endgroup] 2024-08-08T20:56:21.9375005Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2024-08-08T20:56:21.9375761Z # ignore expansion of "docker ps -q" since it could be empty 2024-08-08T20:56:21.9376341Z # shellcheck disable=SC2046 2024-08-08T20:56:21.9376797Z docker stop $(docker ps -q) || true 2024-08-08T20:56:21.9377275Z # Prune all of the docker images 2024-08-08T20:56:21.9377717Z docker system prune -af 2024-08-08T20:56:21.9386020Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:21.9386504Z env: 2024-08-08T20:56:21.9386778Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:21.9387115Z ##[endgroup] 2024-08-08T20:56:21.9705457Z "docker stop" requires at least 1 argument. 2024-08-08T20:56:21.9706347Z See 'docker stop --help'. 2024-08-08T20:56:21.9706667Z 2024-08-08T20:56:21.9706955Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2024-08-08T20:56:21.9707446Z 2024-08-08T20:56:21.9707638Z Stop one or more running containers 2024-08-08T20:56:21.9896082Z Total reclaimed space: 0B 2024-08-08T20:56:21.9941645Z ##[group]Run set +e 2024-08-08T20:56:21.9942002Z set +e 2024-08-08T20:56:21.9942290Z set -x 2024-08-08T20:56:21.9942580Z  2024-08-08T20:56:21.9942895Z PT_DOMAIN=download.pytorch.org 2024-08-08T20:56:21.9943671Z # TODO: Flaky access to download.pytorch.org https://github.com/pytorch/pytorch/issues/100400, 2024-08-08T20:56:21.9944725Z # cleaning this up once the issue is fixed. There are more than one resolved IP here, the last 2024-08-08T20:56:21.9945460Z # one is returned at random 2024-08-08T20:56:21.9945988Z RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" | tail -n1) 2024-08-08T20:56:21.9946490Z  2024-08-08T20:56:21.9946802Z if [ -z "${RESOLVED_IP}" ]; then 2024-08-08T20:56:21.9947409Z  echo "Couldn't resolve ${PT_DOMAIN}, retrying with Google DNS..." 2024-08-08T20:56:21.9948345Z  RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" @8.8.8.8 | tail -n1) 2024-08-08T20:56:21.9948893Z  2024-08-08T20:56:21.9949215Z  if [ -z "${RESOLVED_IP}" ]; then 2024-08-08T20:56:21.9949751Z  echo "Couldn't resolve ${PT_DOMAIN}, exiting..." 2024-08-08T20:56:21.9950248Z  exit 1 2024-08-08T20:56:21.9950559Z  fi 2024-08-08T20:56:21.9951006Z fi 2024-08-08T20:56:21.9951283Z  2024-08-08T20:56:21.9951636Z if grep -r "${PT_DOMAIN}" /etc/hosts; then 2024-08-08T20:56:21.9952282Z  # Clean up any old records first 2024-08-08T20:56:21.9952781Z  sudo sed -i "/${PT_DOMAIN}/d" /etc/hosts 2024-08-08T20:56:21.9953225Z fi 2024-08-08T20:56:21.9953492Z  2024-08-08T20:56:21.9953919Z echo "${RESOLVED_IP} ${PT_DOMAIN}" | sudo tee -a /etc/hosts 2024-08-08T20:56:21.9954457Z cat /etc/hosts 2024-08-08T20:56:21.9963396Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:21.9963891Z env: 2024-08-08T20:56:21.9964169Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:21.9964504Z ##[endgroup] 2024-08-08T20:56:21.9993239Z + PT_DOMAIN=download.pytorch.org 2024-08-08T20:56:21.9999289Z ++ dig -4 +short download.pytorch.org 2024-08-08T20:56:21.9999863Z ++ tail -n1 2024-08-08T20:56:22.0266175Z + RESOLVED_IP=18.160.10.22 2024-08-08T20:56:22.0266709Z + '[' -z 18.160.10.22 ']' 2024-08-08T20:56:22.0267153Z + grep -r download.pytorch.org /etc/hosts 2024-08-08T20:56:22.0279445Z 18.160.10.22 download.pytorch.org 2024-08-08T20:56:22.0281785Z + sudo sed -i /download.pytorch.org/d /etc/hosts 2024-08-08T20:56:22.1419977Z + echo '18.160.10.22 download.pytorch.org' 2024-08-08T20:56:22.1420610Z + sudo tee -a /etc/hosts 2024-08-08T20:56:22.1745320Z 18.160.10.22 download.pytorch.org 2024-08-08T20:56:22.1767014Z + cat /etc/hosts 2024-08-08T20:56:22.1778188Z 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 2024-08-08T20:56:22.1790455Z ::1 localhost6 localhost6.localdomain6 2024-08-08T20:56:22.1790945Z 18.160.10.22 download.pytorch.org 2024-08-08T20:56:22.1942140Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2024-08-08T20:56:22.1942743Z with: 2024-08-08T20:56:22.1943690Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.1944772Z docker-build-dir: .ci/docker 2024-08-08T20:56:22.1945149Z working-directory: . 2024-08-08T20:56:22.1945607Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:22.1946132Z force-push: false 2024-08-08T20:56:22.1946435Z env: 2024-08-08T20:56:22.1946703Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:22.1947040Z ##[endgroup] 2024-08-08T20:56:22.1965539Z ##[group]Run set -ex 2024-08-08T20:56:22.1965885Z set -ex 2024-08-08T20:56:22.1966176Z  2024-08-08T20:56:22.1966721Z # If the docker build directory or the build script doesn't exist, the action will 2024-08-08T20:56:22.1967671Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2024-08-08T20:56:22.1968452Z # job could then download the pre-built image as usual 2024-08-08T20:56:22.1969176Z if [[ ! -d "${DOCKER_BUILD_DIR}" ]] || [[ ! -f "${DOCKER_BUILD_DIR}/build.sh" ]]; then 2024-08-08T20:56:22.1969838Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1970457Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1970998Z  2024-08-08T20:56:22.1971501Z  echo "There is no Docker build script in ${REPO_NAME} repo, skipping..." 2024-08-08T20:56:22.1972112Z  exit 0 2024-08-08T20:56:22.1972404Z else 2024-08-08T20:56:22.1972765Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1973204Z fi 2024-08-08T20:56:22.1973657Z  2024-08-08T20:56:22.1974115Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2024-08-08T20:56:22.1975000Z  # The docker image name already includes the ECR prefix and tag, so we can just 2024-08-08T20:56:22.1975759Z  # use it as it is, but first let's extract the tag 2024-08-08T20:56:22.1976450Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2024-08-08T20:56:22.1977155Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1977835Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1978378Z else 2024-08-08T20:56:22.1978804Z  DOCKER_TAG=$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2024-08-08T20:56:22.1979457Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1980327Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.1981071Z fi 2024-08-08T20:56:22.1989819Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:22.1990298Z env: 2024-08-08T20:56:22.1990570Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:22.1990919Z REPO_NAME: pytorch 2024-08-08T20:56:22.1991891Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.1993057Z DOCKER_BUILD_DIR: .ci/docker 2024-08-08T20:56:22.1993544Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:22.1994049Z ##[endgroup] 2024-08-08T20:56:22.2023411Z + [[ ! -d .ci/docker ]] 2024-08-08T20:56:22.2023833Z + [[ ! -f .ci/docker/build.sh ]] 2024-08-08T20:56:22.2024214Z + echo skip=false 2024-08-08T20:56:22.2025869Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2024-08-08T20:56:22.2031498Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.2032731Z ++ awk -F '[:,]' '{print $2}' 2024-08-08T20:56:22.2057921Z + DOCKER_TAG=02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.2058616Z + echo docker-tag=02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.2059918Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.2087624Z ##[group]Run set +e 2024-08-08T20:56:22.2088082Z set +e 2024-08-08T20:56:22.2088598Z set -x 2024-08-08T20:56:22.2089008Z  2024-08-08T20:56:22.2089379Z login() { 2024-08-08T20:56:22.2090163Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2024-08-08T20:56:22.2090971Z } 2024-08-08T20:56:22.2091342Z  2024-08-08T20:56:22.2091761Z retry () { 2024-08-08T20:56:22.2092242Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2024-08-08T20:56:22.2092780Z } 2024-08-08T20:56:22.2093188Z  2024-08-08T20:56:22.2093595Z retry login "${DOCKER_REGISTRY}" 2024-08-08T20:56:22.2094100Z  2024-08-08T20:56:22.2094695Z # Check if image already exists, if it does then skip building it 2024-08-08T20:56:22.2095479Z if docker manifest inspect "${DOCKER_IMAGE}"; then 2024-08-08T20:56:22.2096087Z  exit 0 2024-08-08T20:56:22.2096493Z fi 2024-08-08T20:56:22.2096862Z  2024-08-08T20:56:22.2097469Z # NB: This part requires a full checkout. Otherwise, the merge base will 2024-08-08T20:56:22.2098377Z # be empty. The default action would be to continue rebuild the image 2024-08-08T20:56:22.2099376Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2024-08-08T20:56:22.2100222Z  # if we're on the base branch then use the parent commit 2024-08-08T20:56:22.2100958Z  MERGE_BASE=$(git rev-parse HEAD~) 2024-08-08T20:56:22.2101464Z else 2024-08-08T20:56:22.2102022Z  # otherwise we're on a PR, so use the most recent base commit 2024-08-08T20:56:22.2102848Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2024-08-08T20:56:22.2103428Z fi 2024-08-08T20:56:22.2103802Z  2024-08-08T20:56:22.2104272Z if [[ -z "${MERGE_BASE}" ]]; then 2024-08-08T20:56:22.2104852Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.2105395Z  2024-08-08T20:56:22.2106211Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2024-08-08T20:56:22.2107015Z  exit 0 2024-08-08T20:56:22.2107408Z fi 2024-08-08T20:56:22.2107858Z  2024-08-08T20:56:22.2108353Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2024-08-08T20:56:22.2109405Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2024-08-08T20:56:22.2110354Z  exit 1 2024-08-08T20:56:22.2110746Z fi 2024-08-08T20:56:22.2111078Z  2024-08-08T20:56:22.2111726Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2024-08-08T20:56:22.2112895Z # If no image exists but the hash is the same as the previous hash then we should error out here 2024-08-08T20:56:22.2113802Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2024-08-08T20:56:22.2114968Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2024-08-08T20:56:22.2116489Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2024-08-08T20:56:22.2117310Z fi 2024-08-08T20:56:22.2117640Z  2024-08-08T20:56:22.2118275Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2024-08-08T20:56:22.2127033Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:22.2127579Z env: 2024-08-08T20:56:22.2128031Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:22.2128495Z DOCKER_BUILD_DIR: .ci/docker 2024-08-08T20:56:22.2129031Z BASE_REVISION: b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T20:56:22.2130296Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.2131509Z DOCKER_TAG: 02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.2132187Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:22.2132844Z ##[endgroup] 2024-08-08T20:56:22.2163252Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:22.2164298Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:22.2165635Z + aws ecr get-login-password --region us-east-1 2024-08-08T20:56:22.2168113Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:22.7676055Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2024-08-08T20:56:22.7677065Z Configure a credential helper to remove this warning. See 2024-08-08T20:56:22.7678308Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-08-08T20:56:22.7678854Z 2024-08-08T20:56:22.7679049Z Login Succeeded 2024-08-08T20:56:22.7700665Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:22.9792828Z { 2024-08-08T20:56:22.9793394Z "schemaVersion": 2, 2024-08-08T20:56:22.9794248Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2024-08-08T20:56:22.9795825Z "config": { 2024-08-08T20:56:22.9796926Z "mediaType": "application/vnd.docker.container.image.v1+json", 2024-08-08T20:56:22.9798331Z "size": 49379, 2024-08-08T20:56:22.9799149Z "digest": "sha256:cd235e6b1be7a0d3099ad6f7c05391c21669cd859ddf645fc846603ac8b56d9d" 2024-08-08T20:56:22.9800285Z }, 2024-08-08T20:56:22.9800935Z "layers": [ 2024-08-08T20:56:22.9801619Z { 2024-08-08T20:56:22.9802335Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9803187Z "size": 28584317, 2024-08-08T20:56:22.9804041Z "digest": "sha256:63e9bbe323274e77e58d77c6ab6802d247458f784222fbb07a2556d6ec74ee05" 2024-08-08T20:56:22.9804997Z }, 2024-08-08T20:56:22.9805559Z { 2024-08-08T20:56:22.9806212Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9807186Z "size": 7944771, 2024-08-08T20:56:22.9808051Z "digest": "sha256:cfb3d849840ee60cee7b02bad68c1fc3c15928ebcd88f327754766b670578ed6" 2024-08-08T20:56:22.9810036Z }, 2024-08-08T20:56:22.9810503Z { 2024-08-08T20:56:22.9811211Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9811821Z "size": 57593718, 2024-08-08T20:56:22.9812444Z "digest": "sha256:968831e596a6288f0fed9b8612ee4ee8e75511037c4305058805492c5162e481" 2024-08-08T20:56:22.9813190Z }, 2024-08-08T20:56:22.9813488Z { 2024-08-08T20:56:22.9814028Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9814776Z "size": 187, 2024-08-08T20:56:22.9815861Z "digest": "sha256:ea310eb267ca1cab61b6b16f566cd28bfd59a741395a011f5e76716e15ba57c6" 2024-08-08T20:56:22.9816592Z }, 2024-08-08T20:56:22.9817016Z { 2024-08-08T20:56:22.9817498Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9818157Z "size": 6885, 2024-08-08T20:56:22.9818837Z "digest": "sha256:3af11d09e9cd1eb9c379f0a4071231e5a5642eb728b4b33bcb76be291f3c9488" 2024-08-08T20:56:22.9819495Z }, 2024-08-08T20:56:22.9819856Z { 2024-08-08T20:56:22.9820422Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9821313Z "size": 1361380219, 2024-08-08T20:56:22.9822035Z "digest": "sha256:ebfec18059b91e56882881ac34754f917861edb5f732c395d2a1a851bbd6db46" 2024-08-08T20:56:22.9822820Z }, 2024-08-08T20:56:22.9823143Z { 2024-08-08T20:56:22.9824299Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9824942Z "size": 62686, 2024-08-08T20:56:22.9825600Z "digest": "sha256:533b4aebf16914c763b7b0de3ce657590c6f979045e9fdf1f816adaf68d8a4d3" 2024-08-08T20:56:22.9826331Z }, 2024-08-08T20:56:22.9826667Z { 2024-08-08T20:56:22.9827204Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9827849Z "size": 1685, 2024-08-08T20:56:22.9828495Z "digest": "sha256:9dd75d06a0910f28cb1e484b8808724e5a6ee570ecb8fc04631368f546b39ed9" 2024-08-08T20:56:22.9829179Z }, 2024-08-08T20:56:22.9829529Z { 2024-08-08T20:56:22.9830105Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9830744Z "size": 1523, 2024-08-08T20:56:22.9831382Z "digest": "sha256:30bfca4dd3492d60ed8035b0eeb1229897041140db117f5663465a551e25851d" 2024-08-08T20:56:22.9832228Z }, 2024-08-08T20:56:22.9832567Z { 2024-08-08T20:56:22.9833097Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9833768Z "size": 2608186849, 2024-08-08T20:56:22.9834414Z "digest": "sha256:1b57ce94cad9a5097d99b8d6f7bcd51df5a162fc3c5f5686b689b76993724bc8" 2024-08-08T20:56:22.9835123Z }, 2024-08-08T20:56:22.9835487Z { 2024-08-08T20:56:22.9836015Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9836670Z "size": 86631, 2024-08-08T20:56:22.9837354Z "digest": "sha256:9ee6bdb31195dd42fd98147a75d540efe0c5708e0ecda866a0bca060084fddab" 2024-08-08T20:56:22.9838062Z }, 2024-08-08T20:56:22.9838443Z { 2024-08-08T20:56:22.9838966Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9839798Z "size": 1823, 2024-08-08T20:56:22.9840487Z "digest": "sha256:08be3f7f3cfb106016f7b52ce7babbc0fff221bbc8422880fd11f8045fb04e39" 2024-08-08T20:56:22.9841189Z }, 2024-08-08T20:56:22.9841543Z { 2024-08-08T20:56:22.9842071Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9842698Z "size": 245786667, 2024-08-08T20:56:22.9843349Z "digest": "sha256:fd57b6413183c2eafa95d909f3ee48a6d30511178a96333ee4923bb9edc28aa5" 2024-08-08T20:56:22.9844130Z }, 2024-08-08T20:56:22.9844452Z { 2024-08-08T20:56:22.9844989Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9845659Z "size": 545, 2024-08-08T20:56:22.9846291Z "digest": "sha256:9a1bc15482bfb03e613a31fef4f75591cdcd6158faabd8bbf3ddf3f37439737d" 2024-08-08T20:56:22.9847019Z }, 2024-08-08T20:56:22.9847368Z { 2024-08-08T20:56:22.9847889Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9848552Z "size": 1284, 2024-08-08T20:56:22.9849200Z "digest": "sha256:909b1bf15eccb3fd953943989fa974c79d52c18ac03cb592daa71daa0dc415cc" 2024-08-08T20:56:22.9849901Z }, 2024-08-08T20:56:22.9850256Z { 2024-08-08T20:56:22.9850812Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9851466Z "size": 484, 2024-08-08T20:56:22.9852083Z "digest": "sha256:88e7a278ede8a729ae3657c3c7a836a5eeb3229c5c36ab6723cd030b9e7abe85" 2024-08-08T20:56:22.9852792Z }, 2024-08-08T20:56:22.9853151Z { 2024-08-08T20:56:22.9853692Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9854366Z "size": 91713067, 2024-08-08T20:56:22.9855030Z "digest": "sha256:e5d7856b663f150b51762230d693bb44ae2bcbb09a40431bc6dcbf73f9f3e871" 2024-08-08T20:56:22.9855729Z }, 2024-08-08T20:56:22.9856071Z { 2024-08-08T20:56:22.9856604Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9857242Z "size": 3229, 2024-08-08T20:56:22.9857913Z "digest": "sha256:00a0e1fafdad8e9d907d3fc7f0231a49e06158f37329ebf99bc2fd47c70866df" 2024-08-08T20:56:22.9858748Z }, 2024-08-08T20:56:22.9859090Z { 2024-08-08T20:56:22.9859645Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9860283Z "size": 1909, 2024-08-08T20:56:22.9860907Z "digest": "sha256:b137e3b1350c038bcf530e9081f56d6b79a94cb2c8c56da7aebd04e69ec09c77" 2024-08-08T20:56:22.9861654Z }, 2024-08-08T20:56:22.9861987Z { 2024-08-08T20:56:22.9862506Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9863181Z "size": 700, 2024-08-08T20:56:22.9863851Z "digest": "sha256:fc51810b1926e7228fbf5d689aae1d1dd5a5d720d28318406725a284fd6ec07c" 2024-08-08T20:56:22.9864594Z }, 2024-08-08T20:56:22.9864945Z { 2024-08-08T20:56:22.9865466Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9866125Z "size": 2801669971, 2024-08-08T20:56:22.9866804Z "digest": "sha256:fbcb08d9f4bc971495bd9eeea6116fb1efd8e1b13008d82139c326d75c07a7d5" 2024-08-08T20:56:22.9867499Z }, 2024-08-08T20:56:22.9867863Z { 2024-08-08T20:56:22.9868396Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9869035Z "size": 381, 2024-08-08T20:56:22.9869669Z "digest": "sha256:4192f5853f54248539aefdbe4921d763d158bcde68eabe869e412111d8e29739" 2024-08-08T20:56:22.9870379Z }, 2024-08-08T20:56:22.9870719Z { 2024-08-08T20:56:22.9871288Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9872036Z "size": 12893, 2024-08-08T20:56:22.9872687Z "digest": "sha256:62e5e179e864ff5209fd3a8fc2022df745a07048b3142aea9d90450b85a79f46" 2024-08-08T20:56:22.9873400Z }, 2024-08-08T20:56:22.9873805Z { 2024-08-08T20:56:22.9874399Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9874996Z "size": 803, 2024-08-08T20:56:22.9875676Z "digest": "sha256:3e326a36ef0f1bb7289f721af48968221079ef67dccb0130fa29f6d1e2ffffaa" 2024-08-08T20:56:22.9876497Z }, 2024-08-08T20:56:22.9876799Z { 2024-08-08T20:56:22.9877398Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9878067Z "size": 106, 2024-08-08T20:56:22.9878642Z "digest": "sha256:56d2d1e08c75d74d55ce83ba1f530347659846b032da1f0389ec10ef0e895943" 2024-08-08T20:56:22.9879401Z }, 2024-08-08T20:56:22.9879742Z { 2024-08-08T20:56:22.9880220Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9880920Z "size": 504, 2024-08-08T20:56:22.9881524Z "digest": "sha256:438d85947b431d78959c32a400342dc72d5b95935ce3f0640fb21aa21df254d0" 2024-08-08T20:56:22.9882165Z }, 2024-08-08T20:56:22.9882571Z { 2024-08-08T20:56:22.9883084Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9883715Z "size": 121477346, 2024-08-08T20:56:22.9884449Z "digest": "sha256:65072a4267b4e36b3af50ebd6d8bb81906d21b5320d11a00221adbb8b6495c05" 2024-08-08T20:56:22.9885179Z }, 2024-08-08T20:56:22.9885479Z { 2024-08-08T20:56:22.9886077Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9886707Z "size": 109, 2024-08-08T20:56:22.9887317Z "digest": "sha256:c88cf310204cc37c4bcffa5711ff941f931dfa5b05bcb7728c89ac6967b9b764" 2024-08-08T20:56:22.9888074Z }, 2024-08-08T20:56:22.9888409Z { 2024-08-08T20:56:22.9888911Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9889602Z "size": 489, 2024-08-08T20:56:22.9890233Z "digest": "sha256:a54e8a071aefefb71af4568e1db1da2d7f95f83c11334bf5da54c990dd5d9a60" 2024-08-08T20:56:22.9890922Z }, 2024-08-08T20:56:22.9891314Z { 2024-08-08T20:56:22.9891853Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9892469Z "size": 297, 2024-08-08T20:56:22.9893153Z "digest": "sha256:37cedc534d8875f5e6a116df8843321b6f12d2baef68d7e38a8f784c69e17522" 2024-08-08T20:56:22.9893915Z }, 2024-08-08T20:56:22.9894224Z { 2024-08-08T20:56:22.9894791Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9895555Z "size": 103, 2024-08-08T20:56:22.9896153Z "digest": "sha256:33a4aacd6d0ee8d26a8f108968c61b30bbde076c8412b495b27fdd1376accf3f" 2024-08-08T20:56:22.9896908Z }, 2024-08-08T20:56:22.9897270Z { 2024-08-08T20:56:22.9897746Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9898448Z "size": 1473, 2024-08-08T20:56:22.9899090Z "digest": "sha256:c5585171af7c07862130070fcab84d7ed9f31e6980420fb38f443bceaf612884" 2024-08-08T20:56:22.9899743Z }, 2024-08-08T20:56:22.9900150Z { 2024-08-08T20:56:22.9900670Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9901267Z "size": 424146559, 2024-08-08T20:56:22.9901997Z "digest": "sha256:f956a95cc728a3aee9c3c8bc0b32ebfe598f92dc4cc11ee88fcc17f5cd4bce29" 2024-08-08T20:56:22.9902698Z }, 2024-08-08T20:56:22.9902994Z { 2024-08-08T20:56:22.9903587Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9904271Z "size": 160, 2024-08-08T20:56:22.9904863Z "digest": "sha256:57f752d95ca739b39ff7b9fbc615535f3964a9f8dff4aebc5b050f061f2da576" 2024-08-08T20:56:22.9905653Z }, 2024-08-08T20:56:22.9905983Z { 2024-08-08T20:56:22.9906458Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9907165Z "size": 563, 2024-08-08T20:56:22.9907769Z "digest": "sha256:529bfa1f03893c15b190023497229174fb0a85f98ef370614f33fd6939524704" 2024-08-08T20:56:22.9908410Z }, 2024-08-08T20:56:22.9908822Z { 2024-08-08T20:56:22.9909339Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9909962Z "size": 35867002, 2024-08-08T20:56:22.9910643Z "digest": "sha256:241d45dc1ded76b86b1c877d5887f1692cbf77a7c32b11d5118284993f679510" 2024-08-08T20:56:22.9911333Z }, 2024-08-08T20:56:22.9911651Z { 2024-08-08T20:56:22.9912356Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9913141Z "size": 104, 2024-08-08T20:56:22.9913739Z "digest": "sha256:01e679af7dbce49ea38c30a7f4ed07e5eb6d4986b2ca65458e39d8f1d178a6fc" 2024-08-08T20:56:22.9914529Z }, 2024-08-08T20:56:22.9914890Z { 2024-08-08T20:56:22.9915715Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9916444Z "size": 425, 2024-08-08T20:56:22.9917089Z "digest": "sha256:667d9aa101b1e662567dbbd3e5e9536c9829754dc5ba9dc5dfc57a4c1dfbfaad" 2024-08-08T20:56:22.9917748Z }, 2024-08-08T20:56:22.9918153Z { 2024-08-08T20:56:22.9918693Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9919299Z "size": 20262086, 2024-08-08T20:56:22.9920063Z "digest": "sha256:4597a1cd71c8273d55d59fd733edbe8b77dfa46fa62424313aa610e00bcee9db" 2024-08-08T20:56:22.9920853Z }, 2024-08-08T20:56:22.9921156Z { 2024-08-08T20:56:22.9921749Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9922470Z "size": 439, 2024-08-08T20:56:22.9923105Z "digest": "sha256:d5d1806bb65ac40856eb41fed88289f4565d3c2eeccbc7f13424e91e6515bbd1" 2024-08-08T20:56:22.9923938Z }, 2024-08-08T20:56:22.9924277Z { 2024-08-08T20:56:22.9924791Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9925549Z "size": 700, 2024-08-08T20:56:22.9926184Z "digest": "sha256:fc51810b1926e7228fbf5d689aae1d1dd5a5d720d28318406725a284fd6ec07c" 2024-08-08T20:56:22.9926868Z }, 2024-08-08T20:56:22.9927282Z { 2024-08-08T20:56:22.9927803Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9928402Z "size": 142, 2024-08-08T20:56:22.9929084Z "digest": "sha256:2090aa744d9926715bd4f4e71e7275d460c24ddee297c235e133ae52f6d6557b" 2024-08-08T20:56:22.9929776Z }, 2024-08-08T20:56:22.9930073Z { 2024-08-08T20:56:22.9930658Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9931294Z "size": 135, 2024-08-08T20:56:22.9931888Z "digest": "sha256:508b81f2bbecf87861845ddd9affe1e1ce7717d82c2d8d8446c94d0ae7bb634c" 2024-08-08T20:56:22.9932823Z }, 2024-08-08T20:56:22.9933170Z { 2024-08-08T20:56:22.9933717Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9934458Z "size": 32, 2024-08-08T20:56:22.9935093Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-08T20:56:22.9935780Z }, 2024-08-08T20:56:22.9936200Z { 2024-08-08T20:56:22.9936743Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9937349Z "size": 190, 2024-08-08T20:56:22.9938022Z "digest": "sha256:6501ca0d0154cc806fe5d4b0b62dd865fcbf9981be3236045a2d23ced95b3662" 2024-08-08T20:56:22.9938748Z }, 2024-08-08T20:56:22.9939048Z { 2024-08-08T20:56:22.9939622Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9940310Z "size": 565, 2024-08-08T20:56:22.9940897Z "digest": "sha256:56781236accc9f35d8bba6ab5c965b782e312713eed0b11ab6a623caf5e7c7dc" 2024-08-08T20:56:22.9941661Z }, 2024-08-08T20:56:22.9942029Z { 2024-08-08T20:56:22.9942523Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9943218Z "size": 43163828, 2024-08-08T20:56:22.9943920Z "digest": "sha256:3794293299cd9060f62bdfd80afba731e6643595c0454f57e8635c11299d3970" 2024-08-08T20:56:22.9944572Z }, 2024-08-08T20:56:22.9944968Z { 2024-08-08T20:56:22.9945504Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9946102Z "size": 106, 2024-08-08T20:56:22.9946799Z "digest": "sha256:a4bf37ae774bb6fb81ef08ad416a3f326560824c22261e3af8128e4d1215ebea" 2024-08-08T20:56:22.9947533Z }, 2024-08-08T20:56:22.9947831Z { 2024-08-08T20:56:22.9948421Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9949057Z "size": 1212, 2024-08-08T20:56:22.9949649Z "digest": "sha256:5f92aea02160822d753bb8f832aa0cabeb20fccbb172c708245de3a3b41bbadc" 2024-08-08T20:56:22.9950571Z }, 2024-08-08T20:56:22.9950908Z { 2024-08-08T20:56:22.9951394Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9952198Z "size": 700, 2024-08-08T20:56:22.9952816Z "digest": "sha256:fc51810b1926e7228fbf5d689aae1d1dd5a5d720d28318406725a284fd6ec07c" 2024-08-08T20:56:22.9953498Z }, 2024-08-08T20:56:22.9953897Z { 2024-08-08T20:56:22.9954447Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9955063Z "size": 137, 2024-08-08T20:56:22.9955735Z "digest": "sha256:3976563a126eae89f7e0b04ae1298bb24537eb8dd20d527857d59affeeef0a69" 2024-08-08T20:56:22.9956429Z }, 2024-08-08T20:56:22.9956751Z { 2024-08-08T20:56:22.9957323Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9957957Z "size": 120, 2024-08-08T20:56:22.9958553Z "digest": "sha256:1457a83f811084f2f2e97af9a90728d3816379fc8966bd6a4f13571f15c5411b" 2024-08-08T20:56:22.9959299Z }, 2024-08-08T20:56:22.9959655Z { 2024-08-08T20:56:22.9960134Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9960855Z "size": 1916631161, 2024-08-08T20:56:22.9961560Z "digest": "sha256:f4957e5566dfa8c923498bbccf987233d635fae2478a7ad7dc1c56fa068566b4" 2024-08-08T20:56:22.9962309Z }, 2024-08-08T20:56:22.9962610Z { 2024-08-08T20:56:22.9963153Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9963752Z "size": 172, 2024-08-08T20:56:22.9964450Z "digest": "sha256:ea28e901cc2b64d1805adae2e3de9efae7ce5b497c0b5ae30d6dda7286e85b59" 2024-08-08T20:56:22.9965182Z }, 2024-08-08T20:56:22.9965483Z { 2024-08-08T20:56:22.9966063Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9966721Z "size": 907, 2024-08-08T20:56:22.9967300Z "digest": "sha256:128f963e326f8113050faaf9dc9dce161b567b56ef9b3359eef6948968bc6ba5" 2024-08-08T20:56:22.9968079Z }, 2024-08-08T20:56:22.9968449Z { 2024-08-08T20:56:22.9968930Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9969748Z "size": 700, 2024-08-08T20:56:22.9970373Z "digest": "sha256:fc51810b1926e7228fbf5d689aae1d1dd5a5d720d28318406725a284fd6ec07c" 2024-08-08T20:56:22.9971221Z }, 2024-08-08T20:56:22.9971530Z { 2024-08-08T20:56:22.9972080Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9972837Z "size": 134, 2024-08-08T20:56:22.9973476Z "digest": "sha256:0e037b263d65bc1bbf9b2259030d48c11336cfe1e7003c800b00b761c26428f6" 2024-08-08T20:56:22.9974285Z }, 2024-08-08T20:56:22.9974735Z { 2024-08-08T20:56:22.9975246Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9975937Z "size": 32, 2024-08-08T20:56:22.9976687Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-08T20:56:22.9977352Z }, 2024-08-08T20:56:22.9977686Z { 2024-08-08T20:56:22.9978283Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9978885Z "size": 156, 2024-08-08T20:56:22.9979516Z "digest": "sha256:75143e4e0c7767a475016c209b46830984e3f7389a25ff4a6d9876685802c057" 2024-08-08T20:56:22.9980255Z }, 2024-08-08T20:56:22.9980558Z { 2024-08-08T20:56:22.9981099Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9981835Z "size": 1841, 2024-08-08T20:56:22.9982426Z "digest": "sha256:5f2762373cead6b01747aff43a5fca693608d689e4c0ab8f27f592d2c06c19fc" 2024-08-08T20:56:22.9983143Z }, 2024-08-08T20:56:22.9983525Z { 2024-08-08T20:56:22.9984004Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9984662Z "size": 7529784, 2024-08-08T20:56:22.9985345Z "digest": "sha256:b0fdf0dc9c7d4594c8bcb1938d0554a76479140d120b539fd09d8a79c0fa3062" 2024-08-08T20:56:22.9986006Z }, 2024-08-08T20:56:22.9986367Z { 2024-08-08T20:56:22.9986933Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9987642Z "size": 164, 2024-08-08T20:56:22.9988275Z "digest": "sha256:7cf5bf391d8b5fbe7c63b34e2bbd993b6eb25d5662376f16e6ca06bc1d6eb6c9" 2024-08-08T20:56:22.9989064Z }, 2024-08-08T20:56:22.9989388Z { 2024-08-08T20:56:22.9989904Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9990584Z "size": 7944, 2024-08-08T20:56:22.9991207Z "digest": "sha256:c274269f5d7e256bbe90ebbc9b5218d6aafffa5b2e3abea03c095f59eaf3ee27" 2024-08-08T20:56:22.9991978Z }, 2024-08-08T20:56:22.9992386Z { 2024-08-08T20:56:22.9992868Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9993510Z "size": 8069, 2024-08-08T20:56:22.9994210Z "digest": "sha256:163b636fa1af9a323822eff8881a3fceb1fd8da7c633ce060895c3a4aa93f594" 2024-08-08T20:56:22.9994870Z }, 2024-08-08T20:56:22.9995238Z { 2024-08-08T20:56:22.9995825Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9996430Z "size": 301, 2024-08-08T20:56:22.9997046Z "digest": "sha256:21c5c3d188ac93a5846c6147ef3b23001e97ce8dc9765f5e236710ee84b127f8" 2024-08-08T20:56:22.9997809Z }, 2024-08-08T20:56:22.9998101Z { 2024-08-08T20:56:22.9998620Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:22.9999329Z "size": 7630427, 2024-08-08T20:56:22.9999924Z "digest": "sha256:8befc1fd9685802235434744164c49e50cd83403b4de685426ba7f7fd9dfeaa5" 2024-08-08T20:56:23.0000615Z }, 2024-08-08T20:56:23.0001036Z { 2024-08-08T20:56:23.0001510Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0002232Z "size": 108, 2024-08-08T20:56:23.0002903Z "digest": "sha256:6061510659f35eaeafca4b18e7d6769ff2d0793bd5a7c524b0d853c183131830" 2024-08-08T20:56:23.0003556Z }, 2024-08-08T20:56:23.0003906Z { 2024-08-08T20:56:23.0004466Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0005066Z "size": 54145774, 2024-08-08T20:56:23.0005736Z "digest": "sha256:1baeb6c7bd597982ba5c766b7723957e2c53a616da85f1ea79a0bbbefeaa1bb8" 2024-08-08T20:56:23.0006483Z }, 2024-08-08T20:56:23.0006875Z { 2024-08-08T20:56:23.0007412Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0008097Z "size": 473, 2024-08-08T20:56:23.0008703Z "digest": "sha256:8b31ac4f8898b741c2b9f5ded6e82541cd590de3f0682d37cfdc828d9162674b" 2024-08-08T20:56:23.0009476Z }, 2024-08-08T20:56:23.0009854Z { 2024-08-08T20:56:23.0010348Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0010979Z "size": 1374858366, 2024-08-08T20:56:23.0011675Z "digest": "sha256:64f3015ce9ddf56cae992548f3a7d61fee39a0c9fca4aa9c2aeb66a6413c8877" 2024-08-08T20:56:23.0012356Z }, 2024-08-08T20:56:23.0012688Z { 2024-08-08T20:56:23.0013245Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0013911Z "size": 106, 2024-08-08T20:56:23.0014524Z "digest": "sha256:ee76d9405e95828d1c7f1859f3b9f4f755a9521b24fee3eca27650a3acef9923" 2024-08-08T20:56:23.0015535Z }, 2024-08-08T20:56:23.0015840Z { 2024-08-08T20:56:23.0016435Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0017141Z "size": 558, 2024-08-08T20:56:23.0017727Z "digest": "sha256:35b0f99cd85f08c7dead3225af75e654daaf96653d7516e8a3dcb0f8ad383803" 2024-08-08T20:56:23.0018426Z }, 2024-08-08T20:56:23.0018831Z { 2024-08-08T20:56:23.0019300Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0019932Z "size": 46248419, 2024-08-08T20:56:23.0020623Z "digest": "sha256:cd4355139790949344d68191a7fb9f145f75eb88b58b071e855aadb7e9f68897" 2024-08-08T20:56:23.0021268Z }, 2024-08-08T20:56:23.0021601Z { 2024-08-08T20:56:23.0022187Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0022851Z "size": 111, 2024-08-08T20:56:23.0023471Z "digest": "sha256:6520dd9eea8aa7f640cc677ab3057a3cbad7331a8dc5e597a5038425b9f77b33" 2024-08-08T20:56:23.0024456Z }, 2024-08-08T20:56:23.0024750Z { 2024-08-08T20:56:23.0025351Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0026033Z "size": 32, 2024-08-08T20:56:23.0026614Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-08T20:56:23.0027333Z }, 2024-08-08T20:56:23.0027714Z { 2024-08-08T20:56:23.0028209Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0028840Z "size": 32, 2024-08-08T20:56:23.0029585Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-08T20:56:23.0030266Z }, 2024-08-08T20:56:23.0030596Z { 2024-08-08T20:56:23.0031165Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0031782Z "size": 32, 2024-08-08T20:56:23.0032456Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-08T20:56:23.0033210Z }, 2024-08-08T20:56:23.0033530Z { 2024-08-08T20:56:23.0034041Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2024-08-08T20:56:23.0034746Z "size": 32, 2024-08-08T20:56:23.0035326Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2024-08-08T20:56:23.0036017Z } 2024-08-08T20:56:23.0036487Z ] 2024-08-08T20:56:23.0036779Z } 2024-08-08T20:56:23.0037131Z + exit 0 2024-08-08T20:56:23.0153962Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*/} 2024-08-08T20:56:23.0154499Z tag=${ECR_DOCKER_IMAGE##*/} 2024-08-08T20:56:23.0155157Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2024-08-08T20:56:23.0164581Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:23.0165160Z env: 2024-08-08T20:56:23.0165605Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:23.0166713Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.0167931Z ##[endgroup] 2024-08-08T20:56:23.0202048Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9-02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.0461882Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2024-08-08T20:56:23.0462450Z with: 2024-08-08T20:56:23.0463377Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.0464548Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:23.0465047Z env: 2024-08-08T20:56:23.0465325Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:23.0465671Z ##[endgroup] 2024-08-08T20:56:23.0483394Z ##[group]Run set -x 2024-08-08T20:56:23.0483768Z set -x 2024-08-08T20:56:23.0484099Z set +e 2024-08-08T20:56:23.0484402Z  2024-08-08T20:56:23.0484681Z login() { 2024-08-08T20:56:23.0485389Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2024-08-08T20:56:23.0486177Z } 2024-08-08T20:56:23.0486459Z  2024-08-08T20:56:23.0486778Z retry () { 2024-08-08T20:56:23.0487176Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2024-08-08T20:56:23.0487653Z } 2024-08-08T20:56:23.0487938Z  2024-08-08T20:56:23.0488259Z retry login "${DOCKER_REGISTRY}" 2024-08-08T20:56:23.0488699Z  2024-08-08T20:56:23.0488979Z set -e 2024-08-08T20:56:23.0489473Z # ignore output since only exit code is used for conditional 2024-08-08T20:56:23.0490228Z # only pull docker image if it's not available locally 2024-08-08T20:56:23.0491055Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2024-08-08T20:56:23.0491805Z  retry docker pull "${DOCKER_IMAGE}" 2024-08-08T20:56:23.0492263Z fi 2024-08-08T20:56:23.0501157Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:56:23.0501831Z env: 2024-08-08T20:56:23.0502109Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:56:23.0503101Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.0504250Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:23.0504757Z ##[endgroup] 2024-08-08T20:56:23.0535258Z + set +e 2024-08-08T20:56:23.0535812Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:23.0536451Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:23.0539415Z + aws ecr get-login-password --region us-east-1 2024-08-08T20:56:23.0540934Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2024-08-08T20:56:23.5873215Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2024-08-08T20:56:23.5874092Z Configure a credential helper to remove this warning. See 2024-08-08T20:56:23.5875235Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2024-08-08T20:56:23.5875840Z 2024-08-08T20:56:23.5875982Z Login Succeeded 2024-08-08T20:56:23.5895882Z + set -e 2024-08-08T20:56:23.5897062Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.6061928Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.6063788Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:56:23.8414424Z 02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9: Pulling from pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9 2024-08-08T20:56:23.8415775Z 63e9bbe32327: Pulling fs layer 2024-08-08T20:56:23.8416320Z cfb3d849840e: Pulling fs layer 2024-08-08T20:56:23.8416820Z 968831e596a6: Pulling fs layer 2024-08-08T20:56:23.8417411Z ea310eb267ca: Pulling fs layer 2024-08-08T20:56:23.8418268Z 3af11d09e9cd: Pulling fs layer 2024-08-08T20:56:23.8418769Z ebfec18059b9: Pulling fs layer 2024-08-08T20:56:23.8419253Z 533b4aebf169: Pulling fs layer 2024-08-08T20:56:23.8419750Z 9dd75d06a091: Pulling fs layer 2024-08-08T20:56:23.8420267Z 30bfca4dd349: Pulling fs layer 2024-08-08T20:56:23.8420789Z 1b57ce94cad9: Pulling fs layer 2024-08-08T20:56:23.8421288Z 9ee6bdb31195: Pulling fs layer 2024-08-08T20:56:23.8421732Z 08be3f7f3cfb: Pulling fs layer 2024-08-08T20:56:23.8422220Z fd57b6413183: Pulling fs layer 2024-08-08T20:56:23.8422647Z 9a1bc15482bf: Pulling fs layer 2024-08-08T20:56:23.8423087Z ea310eb267ca: Waiting 2024-08-08T20:56:23.8423518Z 909b1bf15ecc: Pulling fs layer 2024-08-08T20:56:23.8423987Z 88e7a278ede8: Pulling fs layer 2024-08-08T20:56:23.8424355Z e5d7856b663f: Pulling fs layer 2024-08-08T20:56:23.8424843Z 00a0e1fafdad: Pulling fs layer 2024-08-08T20:56:23.8425342Z 533b4aebf169: Waiting 2024-08-08T20:56:23.8425767Z b137e3b1350c: Pulling fs layer 2024-08-08T20:56:23.8426288Z fc51810b1926: Pulling fs layer 2024-08-08T20:56:23.8426772Z fbcb08d9f4bc: Pulling fs layer 2024-08-08T20:56:23.8427230Z 4192f5853f54: Pulling fs layer 2024-08-08T20:56:23.8427604Z 62e5e179e864: Pulling fs layer 2024-08-08T20:56:23.8427978Z 3e326a36ef0f: Pulling fs layer 2024-08-08T20:56:23.8428330Z 9dd75d06a091: Waiting 2024-08-08T20:56:23.8428660Z 56d2d1e08c75: Pulling fs layer 2024-08-08T20:56:23.8429017Z fd57b6413183: Waiting 2024-08-08T20:56:23.8429324Z 1b57ce94cad9: Waiting 2024-08-08T20:56:23.8429644Z 08be3f7f3cfb: Waiting 2024-08-08T20:56:23.8430067Z 00a0e1fafdad: Waiting 2024-08-08T20:56:23.8430492Z 438d85947b43: Pulling fs layer 2024-08-08T20:56:23.8430969Z 65072a4267b4: Pulling fs layer 2024-08-08T20:56:23.8431371Z b137e3b1350c: Waiting 2024-08-08T20:56:23.8431695Z c88cf310204c: Pulling fs layer 2024-08-08T20:56:23.8432396Z fc51810b1926: Waiting 2024-08-08T20:56:23.8432709Z 9a1bc15482bf: Waiting 2024-08-08T20:56:23.8433033Z a54e8a071aef: Pulling fs layer 2024-08-08T20:56:23.8433409Z 37cedc534d88: Pulling fs layer 2024-08-08T20:56:23.8433792Z 33a4aacd6d0e: Pulling fs layer 2024-08-08T20:56:23.8434160Z c5585171af7c: Pulling fs layer 2024-08-08T20:56:23.8434540Z f956a95cc728: Pulling fs layer 2024-08-08T20:56:23.8434910Z 57f752d95ca7: Pulling fs layer 2024-08-08T20:56:23.8435257Z ebfec18059b9: Waiting 2024-08-08T20:56:23.8435592Z 529bfa1f0389: Pulling fs layer 2024-08-08T20:56:23.8435945Z 62e5e179e864: Waiting 2024-08-08T20:56:23.8436352Z 241d45dc1ded: Pulling fs layer 2024-08-08T20:56:23.8436710Z 3e326a36ef0f: Waiting 2024-08-08T20:56:23.8437092Z fbcb08d9f4bc: Waiting 2024-08-08T20:56:23.8437627Z 01e679af7dbc: Pulling fs layer 2024-08-08T20:56:23.8438045Z 4192f5853f54: Waiting 2024-08-08T20:56:23.8438430Z 667d9aa101b1: Pulling fs layer 2024-08-08T20:56:23.8438905Z 909b1bf15ecc: Waiting 2024-08-08T20:56:23.8439310Z 438d85947b43: Waiting 2024-08-08T20:56:23.8439711Z 65072a4267b4: Waiting 2024-08-08T20:56:23.8440034Z 4597a1cd71c8: Pulling fs layer 2024-08-08T20:56:23.8440393Z 37cedc534d88: Waiting 2024-08-08T20:56:23.8440727Z d5d1806bb65a: Pulling fs layer 2024-08-08T20:56:23.8441078Z 33a4aacd6d0e: Waiting 2024-08-08T20:56:23.8441395Z a54e8a071aef: Waiting 2024-08-08T20:56:23.8441717Z 9ee6bdb31195: Waiting 2024-08-08T20:56:23.8442020Z 30bfca4dd349: Waiting 2024-08-08T20:56:23.8442337Z 3af11d09e9cd: Waiting 2024-08-08T20:56:23.8442651Z 241d45dc1ded: Waiting 2024-08-08T20:56:23.8442954Z 01e679af7dbc: Waiting 2024-08-08T20:56:23.8443260Z f956a95cc728: Waiting 2024-08-08T20:56:23.8443568Z 667d9aa101b1: Waiting 2024-08-08T20:56:23.8443877Z 4597a1cd71c8: Waiting 2024-08-08T20:56:23.8444235Z 529bfa1f0389: Waiting 2024-08-08T20:56:23.8444559Z 2090aa744d99: Pulling fs layer 2024-08-08T20:56:23.8444910Z d5d1806bb65a: Waiting 2024-08-08T20:56:23.8445227Z 56d2d1e08c75: Waiting 2024-08-08T20:56:23.8445555Z 508b81f2bbec: Pulling fs layer 2024-08-08T20:56:23.8445933Z 4f4fb700ef54: Pulling fs layer 2024-08-08T20:56:23.8446296Z 2090aa744d99: Waiting 2024-08-08T20:56:23.8446618Z 6501ca0d0154: Pulling fs layer 2024-08-08T20:56:23.8447163Z c88cf310204c: Waiting 2024-08-08T20:56:23.8447495Z 56781236accc: Pulling fs layer 2024-08-08T20:56:23.8447854Z 508b81f2bbec: Waiting 2024-08-08T20:56:23.8448157Z 4f4fb700ef54: Waiting 2024-08-08T20:56:23.8448483Z 3794293299cd: Pulling fs layer 2024-08-08T20:56:23.8448988Z a4bf37ae774b: Pulling fs layer 2024-08-08T20:56:23.8449467Z 5f92aea02160: Pulling fs layer 2024-08-08T20:56:23.8450003Z 88e7a278ede8: Waiting 2024-08-08T20:56:23.8450367Z c5585171af7c: Waiting 2024-08-08T20:56:23.8450794Z 56781236accc: Waiting 2024-08-08T20:56:23.8451179Z 3976563a126e: Pulling fs layer 2024-08-08T20:56:23.8451674Z 1457a83f8110: Pulling fs layer 2024-08-08T20:56:23.8452144Z 5f92aea02160: Waiting 2024-08-08T20:56:23.8452564Z a4bf37ae774b: Waiting 2024-08-08T20:56:23.8453002Z f4957e5566df: Pulling fs layer 2024-08-08T20:56:23.8453606Z ea28e901cc2b: Pulling fs layer 2024-08-08T20:56:23.8454181Z 57f752d95ca7: Waiting 2024-08-08T20:56:23.8454590Z 1457a83f8110: Waiting 2024-08-08T20:56:23.8455012Z 128f963e326f: Pulling fs layer 2024-08-08T20:56:23.8455503Z 128f963e326f: Waiting 2024-08-08T20:56:23.8455939Z 0e037b263d65: Pulling fs layer 2024-08-08T20:56:23.8456423Z ea28e901cc2b: Waiting 2024-08-08T20:56:23.8456840Z 3794293299cd: Waiting 2024-08-08T20:56:23.8457281Z 75143e4e0c77: Pulling fs layer 2024-08-08T20:56:23.8457782Z 5f2762373cea: Pulling fs layer 2024-08-08T20:56:23.8458300Z b0fdf0dc9c7d: Pulling fs layer 2024-08-08T20:56:23.8458816Z 7cf5bf391d8b: Pulling fs layer 2024-08-08T20:56:23.8459314Z c274269f5d7e: Pulling fs layer 2024-08-08T20:56:23.8459820Z 163b636fa1af: Pulling fs layer 2024-08-08T20:56:23.8460301Z 5f2762373cea: Waiting 2024-08-08T20:56:23.8460712Z b0fdf0dc9c7d: Waiting 2024-08-08T20:56:23.8461167Z 21c5c3d188ac: Pulling fs layer 2024-08-08T20:56:23.8461657Z 8befc1fd9685: Pulling fs layer 2024-08-08T20:56:23.8462132Z e5d7856b663f: Waiting 2024-08-08T20:56:23.8462696Z 6061510659f3: Pulling fs layer 2024-08-08T20:56:23.8463178Z 1baeb6c7bd59: Pulling fs layer 2024-08-08T20:56:23.8463656Z 8b31ac4f8898: Pulling fs layer 2024-08-08T20:56:23.8464163Z 64f3015ce9dd: Pulling fs layer 2024-08-08T20:56:23.8464638Z 6061510659f3: Waiting 2024-08-08T20:56:23.8465060Z ee76d9405e95: Pulling fs layer 2024-08-08T20:56:23.8465558Z 35b0f99cd85f: Pulling fs layer 2024-08-08T20:56:23.8466040Z 1baeb6c7bd59: Waiting 2024-08-08T20:56:23.8466453Z cd4355139790: Pulling fs layer 2024-08-08T20:56:23.8466945Z 6520dd9eea8a: Pulling fs layer 2024-08-08T20:56:23.8467416Z ee76d9405e95: Waiting 2024-08-08T20:56:23.8467803Z cd4355139790: Waiting 2024-08-08T20:56:23.8468202Z 6520dd9eea8a: Waiting 2024-08-08T20:56:23.8468626Z 35b0f99cd85f: Waiting 2024-08-08T20:56:23.8469029Z 7cf5bf391d8b: Waiting 2024-08-08T20:56:23.8469437Z 163b636fa1af: Waiting 2024-08-08T20:56:23.8469844Z 6501ca0d0154: Waiting 2024-08-08T20:56:23.8470215Z 8b31ac4f8898: Waiting 2024-08-08T20:56:23.9733095Z cfb3d849840e: Verifying Checksum 2024-08-08T20:56:23.9733692Z cfb3d849840e: Download complete 2024-08-08T20:56:24.0430002Z ea310eb267ca: Verifying Checksum 2024-08-08T20:56:24.0430575Z ea310eb267ca: Download complete 2024-08-08T20:56:24.1071019Z 3af11d09e9cd: Verifying Checksum 2024-08-08T20:56:24.1071567Z 3af11d09e9cd: Download complete 2024-08-08T20:56:24.1836165Z 63e9bbe32327: Verifying Checksum 2024-08-08T20:56:24.1836841Z 63e9bbe32327: Download complete 2024-08-08T20:56:24.2718693Z 533b4aebf169: Download complete 2024-08-08T20:56:24.3529019Z 9dd75d06a091: Verifying Checksum 2024-08-08T20:56:24.3530510Z 9dd75d06a091: Download complete 2024-08-08T20:56:24.4442740Z 30bfca4dd349: Verifying Checksum 2024-08-08T20:56:24.4444975Z 30bfca4dd349: Download complete 2024-08-08T20:56:24.4685059Z 968831e596a6: Verifying Checksum 2024-08-08T20:56:24.4685631Z 968831e596a6: Download complete 2024-08-08T20:56:24.5522090Z 9ee6bdb31195: Verifying Checksum 2024-08-08T20:56:24.5522718Z 9ee6bdb31195: Download complete 2024-08-08T20:56:24.6166476Z 08be3f7f3cfb: Verifying Checksum 2024-08-08T20:56:25.3507701Z 63e9bbe32327: Pull complete 2024-08-08T20:56:25.6648709Z cfb3d849840e: Pull complete 2024-08-08T20:56:26.6252244Z 968831e596a6: Pull complete 2024-08-08T20:56:26.6429060Z ea310eb267ca: Pull complete 2024-08-08T20:56:26.6605400Z 3af11d09e9cd: Pull complete 2024-08-08T20:56:27.1291083Z fd57b6413183: Verifying Checksum 2024-08-08T20:56:27.1291559Z fd57b6413183: Download complete 2024-08-08T20:56:27.1991640Z 9a1bc15482bf: Verifying Checksum 2024-08-08T20:56:27.1992301Z 9a1bc15482bf: Download complete 2024-08-08T20:56:27.2827777Z 909b1bf15ecc: Verifying Checksum 2024-08-08T20:56:27.2828373Z 909b1bf15ecc: Download complete 2024-08-08T20:56:27.3544960Z 88e7a278ede8: Verifying Checksum 2024-08-08T20:56:27.3545574Z 88e7a278ede8: Download complete 2024-08-08T20:56:28.3160438Z e5d7856b663f: Download complete 2024-08-08T20:56:28.4087749Z 00a0e1fafdad: Download complete 2024-08-08T20:56:28.4845962Z b137e3b1350c: Download complete 2024-08-08T20:56:28.5670256Z fc51810b1926: Download complete 2024-08-08T20:56:37.7730932Z ebfec18059b9: Verifying Checksum 2024-08-08T20:56:37.8588846Z ebfec18059b9: Download complete 2024-08-08T20:56:37.8589450Z 4192f5853f54: Verifying Checksum 2024-08-08T20:56:37.8589921Z 4192f5853f54: Download complete 2024-08-08T20:56:37.9285756Z 62e5e179e864: Download complete 2024-08-08T20:56:37.9967804Z 3e326a36ef0f: Verifying Checksum 2024-08-08T20:56:37.9968281Z 3e326a36ef0f: Download complete 2024-08-08T20:56:38.0683210Z 56d2d1e08c75: Download complete 2024-08-08T20:56:38.1412326Z 438d85947b43: Verifying Checksum 2024-08-08T20:56:38.1412771Z 438d85947b43: Download complete 2024-08-08T20:56:39.4157220Z 65072a4267b4: Verifying Checksum 2024-08-08T20:56:39.4157759Z 65072a4267b4: Download complete 2024-08-08T20:56:39.4933271Z c88cf310204c: Download complete 2024-08-08T20:56:39.5769715Z a54e8a071aef: Verifying Checksum 2024-08-08T20:56:39.5770270Z a54e8a071aef: Download complete 2024-08-08T20:56:39.6500962Z 37cedc534d88: Verifying Checksum 2024-08-08T20:56:39.6501924Z 37cedc534d88: Download complete 2024-08-08T20:56:39.7324309Z 33a4aacd6d0e: Verifying Checksum 2024-08-08T20:56:39.7324789Z 33a4aacd6d0e: Download complete 2024-08-08T20:56:39.7978900Z c5585171af7c: Verifying Checksum 2024-08-08T20:56:39.7979483Z c5585171af7c: Download complete 2024-08-08T20:56:44.2256099Z f956a95cc728: Verifying Checksum 2024-08-08T20:56:44.2256580Z f956a95cc728: Download complete 2024-08-08T20:56:44.3005434Z 57f752d95ca7: Download complete 2024-08-08T20:56:44.3715997Z 529bfa1f0389: Verifying Checksum 2024-08-08T20:56:44.3716454Z 529bfa1f0389: Download complete 2024-08-08T20:56:44.7793995Z 241d45dc1ded: Verifying Checksum 2024-08-08T20:56:44.7794546Z 241d45dc1ded: Download complete 2024-08-08T20:56:44.8518976Z 01e679af7dbc: Verifying Checksum 2024-08-08T20:56:44.8519483Z 01e679af7dbc: Download complete 2024-08-08T20:56:44.9599287Z 667d9aa101b1: Verifying Checksum 2024-08-08T20:56:44.9599768Z 667d9aa101b1: Download complete 2024-08-08T20:56:45.2160484Z 4597a1cd71c8: Verifying Checksum 2024-08-08T20:56:45.2160982Z 4597a1cd71c8: Download complete 2024-08-08T20:56:45.2936412Z d5d1806bb65a: Verifying Checksum 2024-08-08T20:56:45.2937092Z d5d1806bb65a: Download complete 2024-08-08T20:56:45.3740462Z 2090aa744d99: Verifying Checksum 2024-08-08T20:56:45.3740906Z 2090aa744d99: Download complete 2024-08-08T20:56:45.4579933Z 508b81f2bbec: Download complete 2024-08-08T20:56:45.4698170Z 4f4fb700ef54: Verifying Checksum 2024-08-08T20:56:45.4698765Z 4f4fb700ef54: Download complete 2024-08-08T20:56:45.5439769Z 6501ca0d0154: Download complete 2024-08-08T20:56:45.6115638Z 56781236accc: Download complete 2024-08-08T20:56:46.0961718Z 3794293299cd: Verifying Checksum 2024-08-08T20:56:46.0962182Z 3794293299cd: Download complete 2024-08-08T20:56:46.1721528Z a4bf37ae774b: Download complete 2024-08-08T20:56:46.2451631Z 5f92aea02160: Verifying Checksum 2024-08-08T20:56:46.2452170Z 5f92aea02160: Download complete 2024-08-08T20:56:46.3127612Z 3976563a126e: Download complete 2024-08-08T20:56:46.3966654Z 1457a83f8110: Verifying Checksum 2024-08-08T20:56:46.3967099Z 1457a83f8110: Download complete 2024-08-08T20:56:50.5757709Z 1b57ce94cad9: Verifying Checksum 2024-08-08T20:56:50.5758634Z 1b57ce94cad9: Download complete 2024-08-08T20:56:50.6569859Z ea28e901cc2b: Download complete 2024-08-08T20:56:50.7339254Z 128f963e326f: Download complete 2024-08-08T20:56:50.8714046Z 75143e4e0c77: Verifying Checksum 2024-08-08T20:56:50.8714601Z 75143e4e0c77: Download complete 2024-08-08T20:56:50.9417802Z 5f2762373cea: Download complete 2024-08-08T20:56:51.1519158Z b0fdf0dc9c7d: Verifying Checksum 2024-08-08T20:56:51.1519647Z b0fdf0dc9c7d: Download complete 2024-08-08T20:56:51.2368458Z 7cf5bf391d8b: Verifying Checksum 2024-08-08T20:56:51.2368908Z 7cf5bf391d8b: Download complete 2024-08-08T20:56:51.3040548Z c274269f5d7e: Verifying Checksum 2024-08-08T20:56:51.3041058Z c274269f5d7e: Download complete 2024-08-08T20:56:51.3836390Z 163b636fa1af: Download complete 2024-08-08T20:56:51.4613001Z 21c5c3d188ac: Download complete 2024-08-08T20:56:51.5257006Z ebfec18059b9: Pull complete 2024-08-08T20:56:51.5949353Z 8befc1fd9685: Verifying Checksum 2024-08-08T20:56:51.5949832Z 8befc1fd9685: Download complete 2024-08-08T20:56:51.6635674Z 6061510659f3: Download complete 2024-08-08T20:56:51.7320596Z 533b4aebf169: Pull complete 2024-08-08T20:56:51.9437012Z 9dd75d06a091: Pull complete 2024-08-08T20:56:52.1717869Z 30bfca4dd349: Pull complete 2024-08-08T20:56:52.2604469Z 1baeb6c7bd59: Verifying Checksum 2024-08-08T20:56:52.2605139Z 1baeb6c7bd59: Download complete 2024-08-08T20:56:52.3503668Z 8b31ac4f8898: Download complete 2024-08-08T20:56:56.6647102Z fbcb08d9f4bc: Verifying Checksum 2024-08-08T20:56:56.6647585Z fbcb08d9f4bc: Download complete 2024-08-08T20:56:56.7592704Z ee76d9405e95: Download complete 2024-08-08T20:56:56.8324205Z 35b0f99cd85f: Verifying Checksum 2024-08-08T20:56:56.8324760Z 35b0f99cd85f: Download complete 2024-08-08T20:56:57.3475314Z cd4355139790: Verifying Checksum 2024-08-08T20:56:57.3475942Z cd4355139790: Download complete 2024-08-08T20:56:57.4149812Z 6520dd9eea8a: Verifying Checksum 2024-08-08T20:56:57.4150247Z 6520dd9eea8a: Download complete 2024-08-08T20:57:05.6231511Z f4957e5566df: Verifying Checksum 2024-08-08T20:57:06.1558083Z 64f3015ce9dd: Verifying Checksum 2024-08-08T20:57:06.1558575Z 64f3015ce9dd: Download complete 2024-08-08T20:57:23.2185622Z 1b57ce94cad9: Pull complete 2024-08-08T20:57:23.4149014Z 9ee6bdb31195: Pull complete 2024-08-08T20:57:23.6518562Z 08be3f7f3cfb: Pull complete 2024-08-08T20:57:32.4170872Z fd57b6413183: Pull complete 2024-08-08T20:57:32.6323284Z 9a1bc15482bf: Pull complete 2024-08-08T20:57:32.7813554Z 909b1bf15ecc: Pull complete 2024-08-08T20:57:32.9526837Z 88e7a278ede8: Pull complete 2024-08-08T20:57:35.5509916Z e5d7856b663f: Pull complete 2024-08-08T20:57:35.7674581Z 00a0e1fafdad: Pull complete 2024-08-08T20:57:35.9920227Z b137e3b1350c: Pull complete 2024-08-08T20:57:36.2216825Z fc51810b1926: Pull complete 2024-08-08T20:58:31.5443675Z fbcb08d9f4bc: Pull complete 2024-08-08T20:58:31.7719298Z 4192f5853f54: Pull complete 2024-08-08T20:58:32.0231701Z 62e5e179e864: Pull complete 2024-08-08T20:58:32.2476087Z 3e326a36ef0f: Pull complete 2024-08-08T20:58:32.4691876Z 56d2d1e08c75: Pull complete 2024-08-08T20:58:32.6829770Z 438d85947b43: Pull complete 2024-08-08T20:58:36.1770415Z 65072a4267b4: Pull complete 2024-08-08T20:58:36.2543824Z c88cf310204c: Pull complete 2024-08-08T20:58:36.3776547Z a54e8a071aef: Pull complete 2024-08-08T20:58:36.5030886Z 37cedc534d88: Pull complete 2024-08-08T20:58:36.6716627Z 33a4aacd6d0e: Pull complete 2024-08-08T20:58:36.8487203Z c5585171af7c: Pull complete 2024-08-08T20:58:46.5585122Z f956a95cc728: Pull complete 2024-08-08T20:58:46.7862612Z 57f752d95ca7: Pull complete 2024-08-08T20:58:47.0215512Z 529bfa1f0389: Pull complete 2024-08-08T20:58:47.9198488Z 241d45dc1ded: Pull complete 2024-08-08T20:58:48.0234510Z 01e679af7dbc: Pull complete 2024-08-08T20:58:48.2208447Z 667d9aa101b1: Pull complete 2024-08-08T20:58:48.6854607Z 4597a1cd71c8: Pull complete 2024-08-08T20:58:48.9246018Z d5d1806bb65a: Pull complete 2024-08-08T20:58:49.3453421Z 2090aa744d99: Pull complete 2024-08-08T20:58:49.5667382Z 508b81f2bbec: Pull complete 2024-08-08T20:58:49.7770498Z 4f4fb700ef54: Pull complete 2024-08-08T20:58:49.8763166Z 6501ca0d0154: Pull complete 2024-08-08T20:58:50.0776802Z 56781236accc: Pull complete 2024-08-08T20:58:52.7457599Z 3794293299cd: Pull complete 2024-08-08T20:58:52.8711462Z a4bf37ae774b: Pull complete 2024-08-08T20:58:53.0127381Z 5f92aea02160: Pull complete 2024-08-08T20:58:53.3473015Z 3976563a126e: Pull complete 2024-08-08T20:58:53.5121332Z 1457a83f8110: Pull complete 2024-08-08T20:59:28.5758871Z f4957e5566df: Pull complete 2024-08-08T20:59:28.7205574Z ea28e901cc2b: Pull complete 2024-08-08T20:59:28.7411430Z 128f963e326f: Pull complete 2024-08-08T20:59:28.7823143Z 0e037b263d65: Pull complete 2024-08-08T20:59:28.8183112Z 75143e4e0c77: Pull complete 2024-08-08T20:59:28.8359574Z 5f2762373cea: Pull complete 2024-08-08T20:59:29.0037535Z b0fdf0dc9c7d: Pull complete 2024-08-08T20:59:29.0223431Z 7cf5bf391d8b: Pull complete 2024-08-08T20:59:29.0410023Z c274269f5d7e: Pull complete 2024-08-08T20:59:29.0601059Z 163b636fa1af: Pull complete 2024-08-08T20:59:29.0809104Z 21c5c3d188ac: Pull complete 2024-08-08T20:59:30.2162942Z 8befc1fd9685: Pull complete 2024-08-08T20:59:30.2338719Z 6061510659f3: Pull complete 2024-08-08T20:59:32.3100484Z 1baeb6c7bd59: Pull complete 2024-08-08T20:59:32.5371784Z 8b31ac4f8898: Pull complete 2024-08-08T20:59:45.7801791Z 64f3015ce9dd: Pull complete 2024-08-08T20:59:46.0054606Z ee76d9405e95: Pull complete 2024-08-08T20:59:46.2145916Z 35b0f99cd85f: Pull complete 2024-08-08T20:59:47.0886987Z cd4355139790: Pull complete 2024-08-08T20:59:47.3080804Z 6520dd9eea8a: Pull complete 2024-08-08T20:59:48.2860587Z Digest: sha256:a898a1b237f3fdd8f48bb276749a44016150d2c25ae891805a3c665b34fc8003 2024-08-08T20:59:48.3325785Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:59:48.3546168Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T20:59:48.3620436Z ##[group]Run echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> "$GITHUB_OUTPUT" 2024-08-08T20:59:48.3621359Z echo "IN_ARC_RUNNER=$([ -f /.inarc ] && echo true || echo false)" >> "$GITHUB_OUTPUT" 2024-08-08T20:59:48.3633860Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T20:59:48.3634355Z env: 2024-08-08T20:59:48.3634630Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:59:48.3634970Z ##[endgroup] 2024-08-08T20:59:48.3832747Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2024-08-08T20:59:48.3833288Z with: 2024-08-08T20:59:48.3833571Z driver-version: 550.54.15 2024-08-08T20:59:48.3833916Z env: 2024-08-08T20:59:48.3834180Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:59:48.3834522Z ##[endgroup] 2024-08-08T20:59:48.3952410Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2024-08-08T20:59:48.3953050Z with: 2024-08-08T20:59:48.3953324Z timeout_minutes: 10 2024-08-08T20:59:48.3953651Z max_attempts: 3 2024-08-08T20:59:48.3985816Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils if [[ "${DISTRIBUTION}" == "amzn2023" ]] ; then YUM_REPO_URL="https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo" else # Amazon Linux 2 YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" fi sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y nvidia-docker2 sudo systemctl restart docker ) } install_nvidia_docker2_ubuntu20() { ( set -x # Install nvidia-driver package if not installed status="$(dpkg-query -W --showformat='${db:Status-Status}' nvidia-docker2 2>&1)" if [ ! $? = 0 ] || [ ! "$status" = installed ]; then sudo apt-get install -y nvidia-docker2 sudo systemctl restart docker fi ) } pre_install_nvidia_driver_amzn2() { ( # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms ) } install_nvidia_driver_common() { ( # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then # CAUTION: this may need to be updated in future if [ "${DISTRIBUTION}" != ubuntu20.04 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight fi sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi ) } post_install_nvidia_driver_common() { ( sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi # NB: Annoyingly, nvidia-smi command returns successfully with return code 0 even in # the case where the driver has already crashed as it still can get the driver version # and some basic information like the bus ID. However, the rest of the information # would be missing (ERR!), for example: # # +-----------------------------------------------------------------------------+ # | NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0 | # |-------------------------------+----------------------+----------------------+ # | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | # | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | # | | | MIG M. | # |===============================+======================+======================| # | 0 ERR! Off | 00000000:00:1E.0 Off | ERR! | # |ERR! ERR! ERR! ERR! / ERR! | 4184MiB / 23028MiB | ERR! Default | # | | | ERR! | # +-------------------------------+----------------------+----------------------+ # # +-----------------------------------------------------------------------------+ # | Processes: | # | GPU GI CI PID Type Process name GPU Memory | # | ID ID Usage | # |=============================================================================| # +-----------------------------------------------------------------------------+ # # This should be reported as a failure instead as it will guarantee to fail when # Docker tries to run with --gpus all # # So, the correct check here is to query one of the missing piece of info like # GPU name, so that the command can fail accordingly nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 NVIDIA_SMI_STATUS=$? # Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } install_nvidia_driver_amzn2() { ( set -x pre_install_nvidia_driver_amzn2 install_nvidia_driver_common post_install_nvidia_driver_common ) } install_nvidia_driver_ubuntu20() { ( set -x install_nvidia_driver_common post_install_nvidia_driver_common ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; ubuntu20.04) install_nvidia_driver_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; ubuntu20.04) install_nvidia_docker2_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" # Fix https://github.com/NVIDIA/nvidia-docker/issues/1648 on runners with # more than one GPUs. This just needs to be run once. The command fails # on subsequent runs and complains that the mode is already on, but that's # ok sudo nvidia-persistenced || true # This should show persistence mode ON nvidia-smi 2024-08-08T20:59:48.4019303Z retry_wait_seconds: 10 2024-08-08T20:59:48.4019666Z polling_interval_seconds: 1 2024-08-08T20:59:48.4020032Z warning_on_retry: true 2024-08-08T20:59:48.4020378Z continue_on_error: false 2024-08-08T20:59:48.4020709Z env: 2024-08-08T20:59:48.4020974Z GIT_DEFAULT_BRANCH: main 2024-08-08T20:59:48.4021329Z DRIVER_VERSION: 550.54.15 2024-08-08T20:59:48.4021682Z ##[endgroup] 2024-08-08T20:59:48.4842617Z == Installing nvidia driver NVIDIA-Linux-x86_64-550.54.15.run == 2024-08-08T20:59:48.4843362Z + pre_install_nvidia_driver_amzn2 2024-08-08T20:59:48.4845112Z + sudo yum remove -y nvidia-driver-latest-dkms 2024-08-08T20:59:48.8705231Z No match for argument: nvidia-driver-latest-dkms 2024-08-08T20:59:48.8706020Z No packages marked for removal. 2024-08-08T20:59:48.8769178Z Dependencies resolved. 2024-08-08T20:59:48.8779911Z Nothing to do. 2024-08-08T20:59:48.8780362Z Complete! 2024-08-08T20:59:48.9659463Z + install_nvidia_driver_common 2024-08-08T20:59:48.9662857Z + echo 'Before installing NVIDIA driver' 2024-08-08T20:59:48.9663299Z + lspci 2024-08-08T20:59:48.9663592Z Before installing NVIDIA driver 2024-08-08T20:59:48.9774307Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2024-08-08T20:59:48.9775212Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2024-08-08T20:59:48.9776144Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2024-08-08T20:59:48.9777189Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2024-08-08T20:59:48.9778018Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2024-08-08T20:59:48.9778793Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2024-08-08T20:59:48.9779513Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2024-08-08T20:59:48.9780310Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2024-08-08T20:59:48.9780885Z + lsmod 2024-08-08T20:59:48.9823030Z Module Size Used by 2024-08-08T20:59:48.9823560Z ib_core 454656 0 2024-08-08T20:59:48.9823929Z veth 36864 0 2024-08-08T20:59:48.9824293Z nvidia_modeset 1351680 0 2024-08-08T20:59:48.9824704Z video 65536 1 nvidia_modeset 2024-08-08T20:59:48.9825137Z wmi 36864 1 video 2024-08-08T20:59:48.9828724Z nvidia_uvm 4706304 0 2024-08-08T20:59:48.9829402Z nvidia 54071296 7 nvidia_uvm,nvidia_modeset 2024-08-08T20:59:48.9830066Z drm 602112 1 nvidia 2024-08-08T20:59:48.9830683Z drm_panel_orientation_quirks 28672 1 drm 2024-08-08T20:59:48.9831249Z backlight 24576 3 video,drm,nvidia_modeset 2024-08-08T20:59:48.9832125Z i2c_core 106496 2 nvidia,drm 2024-08-08T20:59:48.9832752Z xt_conntrack 16384 1 2024-08-08T20:59:48.9833265Z nft_chain_nat 16384 3 2024-08-08T20:59:48.9833776Z xt_MASQUERADE 20480 1 2024-08-08T20:59:48.9834386Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2024-08-08T20:59:48.9834898Z nf_conntrack_netlink 57344 0 2024-08-08T20:59:48.9835701Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2024-08-08T20:59:48.9836490Z nf_defrag_ipv6 24576 1 nf_conntrack 2024-08-08T20:59:48.9836984Z nf_defrag_ipv4 16384 1 nf_conntrack 2024-08-08T20:59:48.9837595Z xfrm_user 57344 1 2024-08-08T20:59:48.9838142Z xfrm_algo 16384 1 xfrm_user 2024-08-08T20:59:48.9838655Z xt_addrtype 16384 2 2024-08-08T20:59:48.9839123Z nft_compat 20480 4 2024-08-08T20:59:48.9839570Z nf_tables 307200 57 nft_compat,nft_chain_nat 2024-08-08T20:59:48.9840177Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2024-08-08T20:59:48.9840712Z br_netfilter 36864 0 2024-08-08T20:59:48.9841118Z bridge 307200 1 br_netfilter 2024-08-08T20:59:48.9841568Z stp 16384 1 bridge 2024-08-08T20:59:48.9842151Z llc 16384 2 bridge,stp 2024-08-08T20:59:48.9842612Z overlay 167936 0 2024-08-08T20:59:48.9842985Z tls 114688 0 2024-08-08T20:59:48.9843350Z nls_ascii 16384 1 2024-08-08T20:59:48.9843707Z nls_cp437 20480 1 2024-08-08T20:59:48.9844075Z sunrpc 692224 1 2024-08-08T20:59:48.9844446Z vfat 24576 1 2024-08-08T20:59:48.9844805Z fat 86016 1 vfat 2024-08-08T20:59:48.9845207Z ena 167936 0 2024-08-08T20:59:48.9845697Z ghash_clmulni_intel 16384 0 2024-08-08T20:59:48.9846190Z ptp 36864 1 ena 2024-08-08T20:59:48.9846705Z pps_core 24576 1 ptp 2024-08-08T20:59:48.9847133Z aesni_intel 393216 0 2024-08-08T20:59:48.9847490Z i8042 45056 0 2024-08-08T20:59:48.9847861Z serio 28672 3 i8042 2024-08-08T20:59:48.9848312Z crypto_simd 16384 1 aesni_intel 2024-08-08T20:59:48.9848954Z cryptd 28672 2 crypto_simd,ghash_clmulni_intel 2024-08-08T20:59:48.9849600Z button 24576 0 2024-08-08T20:59:48.9850081Z sch_fq_codel 20480 17 2024-08-08T20:59:48.9850556Z dm_mod 188416 0 2024-08-08T20:59:48.9851016Z fuse 163840 1 2024-08-08T20:59:48.9851493Z loop 36864 0 2024-08-08T20:59:48.9852198Z dax 45056 1 dm_mod 2024-08-08T20:59:48.9852637Z configfs 57344 1 2024-08-08T20:59:48.9853040Z dmi_sysfs 20480 0 2024-08-08T20:59:48.9853408Z crc32_pclmul 16384 0 2024-08-08T20:59:48.9853767Z crc32c_intel 24576 0 2024-08-08T20:59:48.9854134Z efivarfs 24576 1 2024-08-08T20:59:48.9854493Z + modinfo nvidia 2024-08-08T20:59:48.9855228Z filename: /lib/modules/6.1.94-99.176.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2024-08-08T20:59:48.9855921Z alias: char-major-195-* 2024-08-08T20:59:48.9856412Z version: 550.54.15 2024-08-08T20:59:48.9856749Z supported: external 2024-08-08T20:59:48.9857088Z license: NVIDIA 2024-08-08T20:59:48.9857461Z firmware: nvidia/550.54.15/gsp_tu10x.bin 2024-08-08T20:59:48.9857936Z firmware: nvidia/550.54.15/gsp_ga10x.bin 2024-08-08T20:59:48.9858422Z srcversion: 833721318DA517F0C2FEC97 2024-08-08T20:59:48.9858913Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2024-08-08T20:59:48.9859396Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2024-08-08T20:59:48.9859880Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2024-08-08T20:59:48.9860377Z depends: i2c-core,drm 2024-08-08T20:59:48.9860725Z retpoline: Y 2024-08-08T20:59:48.9861035Z name: nvidia 2024-08-08T20:59:48.9861634Z vermagic: 6.1.94-99.176.amzn2023.x86_64 SMP preempt mod_unload modversions 2024-08-08T20:59:48.9862323Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2024-08-08T20:59:48.9862965Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2024-08-08T20:59:48.9863547Z parm: NVreg_ResmanDebugLevel:int 2024-08-08T20:59:48.9863988Z parm: NVreg_RmLogonRC:int 2024-08-08T20:59:48.9864410Z parm: NVreg_ModifyDeviceFiles:int 2024-08-08T20:59:48.9864865Z parm: NVreg_DeviceFileUID:int 2024-08-08T20:59:48.9865296Z parm: NVreg_DeviceFileGID:int 2024-08-08T20:59:48.9865730Z parm: NVreg_DeviceFileMode:int 2024-08-08T20:59:48.9866250Z parm: NVreg_InitializeSystemMemoryAllocations:int 2024-08-08T20:59:48.9866804Z parm: NVreg_UsePageAttributeTable:int 2024-08-08T20:59:48.9867275Z parm: NVreg_EnablePCIeGen3:int 2024-08-08T20:59:48.9867705Z parm: NVreg_EnableMSI:int 2024-08-08T20:59:48.9868117Z parm: NVreg_TCEBypassMode:int 2024-08-08T20:59:48.9868563Z parm: NVreg_EnableStreamMemOPs:int 2024-08-08T20:59:48.9869088Z parm: NVreg_RestrictProfilingToAdminUsers:int 2024-08-08T20:59:48.9869675Z parm: NVreg_PreserveVideoMemoryAllocations:int 2024-08-08T20:59:48.9870223Z parm: NVreg_EnableS0ixPowerManagement:int 2024-08-08T20:59:48.9870823Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2024-08-08T20:59:48.9871415Z parm: NVreg_DynamicPowerManagement:int 2024-08-08T20:59:48.9872137Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2024-08-08T20:59:48.9872732Z parm: NVreg_EnableGpuFirmware:int 2024-08-08T20:59:48.9873216Z parm: NVreg_EnableGpuFirmwareLogs:int 2024-08-08T20:59:48.9873736Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2024-08-08T20:59:48.9874273Z parm: NVreg_EnableUserNUMAManagement:int 2024-08-08T20:59:48.9874761Z parm: NVreg_MemoryPoolSize:int 2024-08-08T20:59:48.9875214Z parm: NVreg_KMallocHeapMaxSize:int 2024-08-08T20:59:48.9875688Z parm: NVreg_VMallocHeapMaxSize:int 2024-08-08T20:59:48.9876156Z parm: NVreg_IgnoreMMIOCheck:int 2024-08-08T20:59:48.9876595Z parm: NVreg_NvLinkDisable:int 2024-08-08T20:59:48.9877093Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2024-08-08T20:59:48.9877624Z parm: NVreg_RegisterPCIDriver:int 2024-08-08T20:59:48.9878091Z parm: NVreg_EnableResizableBar:int 2024-08-08T20:59:48.9878573Z parm: NVreg_EnableDbgBreakpoint:int 2024-08-08T20:59:48.9879064Z parm: NVreg_EnableNonblockingOpen:int 2024-08-08T20:59:48.9879634Z parm: NVreg_RegistryDwords:charp 2024-08-08T20:59:48.9880129Z parm: NVreg_RegistryDwordsPerDevice:charp 2024-08-08T20:59:48.9880603Z parm: NVreg_RmMsg:charp 2024-08-08T20:59:48.9881010Z parm: NVreg_GpuBlacklist:charp 2024-08-08T20:59:48.9881465Z parm: NVreg_TemporaryFilePath:charp 2024-08-08T20:59:48.9881934Z parm: NVreg_ExcludedGpus:charp 2024-08-08T20:59:48.9882383Z parm: NVreg_DmaRemapPeerMmio:int 2024-08-08T20:59:48.9882849Z parm: NVreg_RmNvlinkBandwidth:charp 2024-08-08T20:59:48.9883406Z parm: NVreg_ImexChannelCount:int 2024-08-08T20:59:48.9883857Z parm: rm_firmware_active:charp 2024-08-08T20:59:48.9884257Z + HAS_NVIDIA_DRIVER=0 2024-08-08T20:59:48.9884635Z ++ command -v nvidia-smi 2024-08-08T20:59:48.9885044Z + '[' -x /usr/bin/nvidia-smi ']' 2024-08-08T20:59:48.9885400Z + set +e 2024-08-08T20:59:48.9885905Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2024-08-08T20:59:49.0142112Z + INSTALLED_DRIVER_VERSION=550.54.15 2024-08-08T20:59:49.0142584Z + NVIDIA_SMI_STATUS=0 2024-08-08T20:59:49.0143084Z + '[' 0 -ne 0 ']' 2024-08-08T20:59:49.0143450Z + '[' 550.54.15 '!=' 550.54.15 ']' 2024-08-08T20:59:49.0143817Z + HAS_NVIDIA_DRIVER=1 2024-08-08T20:59:49.0144530Z + echo 'NVIDIA driver (550.54.15) has already been installed. Skipping NVIDIA driver installation' 2024-08-08T20:59:49.0145224Z + set -e 2024-08-08T20:59:49.0145519Z + '[' 1 -eq 0 ']' 2024-08-08T20:59:49.0146081Z NVIDIA driver (550.54.15) has already been installed. Skipping NVIDIA driver installation 2024-08-08T20:59:49.0146764Z + post_install_nvidia_driver_common 2024-08-08T20:59:49.0149295Z + sudo modprobe nvidia 2024-08-08T20:59:49.1621644Z + echo 'After installing NVIDIA driver' 2024-08-08T20:59:49.1622637Z + lspci 2024-08-08T20:59:49.1623226Z After installing NVIDIA driver 2024-08-08T20:59:49.1725474Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2024-08-08T20:59:49.1726507Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2024-08-08T20:59:49.1727548Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2024-08-08T20:59:49.1728348Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2024-08-08T20:59:49.1729282Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2024-08-08T20:59:49.1730121Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2024-08-08T20:59:49.1730851Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2024-08-08T20:59:49.1731665Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2024-08-08T20:59:49.1732250Z + lsmod 2024-08-08T20:59:49.1759109Z Module Size Used by 2024-08-08T20:59:49.1759514Z ib_core 454656 0 2024-08-08T20:59:49.1759965Z veth 36864 0 2024-08-08T20:59:49.1760455Z nvidia_modeset 1351680 0 2024-08-08T20:59:49.1760985Z video 65536 1 nvidia_modeset 2024-08-08T20:59:49.1761572Z wmi 36864 1 video 2024-08-08T20:59:49.1762036Z nvidia_uvm 4706304 0 2024-08-08T20:59:49.1762506Z nvidia 54071296 7 nvidia_uvm,nvidia_modeset 2024-08-08T20:59:49.1763143Z drm 602112 1 nvidia 2024-08-08T20:59:49.1763703Z drm_panel_orientation_quirks 28672 1 drm 2024-08-08T20:59:49.1764223Z backlight 24576 3 video,drm,nvidia_modeset 2024-08-08T20:59:49.1764716Z i2c_core 106496 2 nvidia,drm 2024-08-08T20:59:49.1765133Z xt_conntrack 16384 1 2024-08-08T20:59:49.1765513Z nft_chain_nat 16384 3 2024-08-08T20:59:49.1765885Z xt_MASQUERADE 20480 1 2024-08-08T20:59:49.1766334Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2024-08-08T20:59:49.1766819Z nf_conntrack_netlink 57344 0 2024-08-08T20:59:49.1767386Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2024-08-08T20:59:49.1768257Z nf_defrag_ipv6 24576 1 nf_conntrack 2024-08-08T20:59:49.1768722Z nf_defrag_ipv4 16384 1 nf_conntrack 2024-08-08T20:59:49.1769143Z xfrm_user 57344 1 2024-08-08T20:59:49.1769532Z xfrm_algo 16384 1 xfrm_user 2024-08-08T20:59:49.1769945Z xt_addrtype 16384 2 2024-08-08T20:59:49.1770308Z nft_compat 20480 4 2024-08-08T20:59:49.1770760Z nf_tables 307200 57 nft_compat,nft_chain_nat 2024-08-08T20:59:49.1771365Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2024-08-08T20:59:49.1772103Z br_netfilter 36864 0 2024-08-08T20:59:49.1772516Z bridge 307200 1 br_netfilter 2024-08-08T20:59:49.1772951Z stp 16384 1 bridge 2024-08-08T20:59:49.1773361Z llc 16384 2 bridge,stp 2024-08-08T20:59:49.1773779Z overlay 167936 0 2024-08-08T20:59:49.1774144Z tls 114688 0 2024-08-08T20:59:49.1774510Z nls_ascii 16384 1 2024-08-08T20:59:49.1774874Z nls_cp437 20480 1 2024-08-08T20:59:49.1775239Z sunrpc 692224 1 2024-08-08T20:59:49.1775593Z vfat 24576 1 2024-08-08T20:59:49.1775961Z fat 86016 1 vfat 2024-08-08T20:59:49.1776347Z ena 167936 0 2024-08-08T20:59:49.1776710Z ghash_clmulni_intel 16384 0 2024-08-08T20:59:49.1777098Z ptp 36864 1 ena 2024-08-08T20:59:49.1777549Z pps_core 24576 1 ptp 2024-08-08T20:59:49.1778060Z aesni_intel 393216 0 2024-08-08T20:59:49.1778524Z i8042 45056 0 2024-08-08T20:59:49.1779037Z serio 28672 3 i8042 2024-08-08T20:59:49.1779563Z crypto_simd 16384 1 aesni_intel 2024-08-08T20:59:49.1780218Z cryptd 28672 2 crypto_simd,ghash_clmulni_intel 2024-08-08T20:59:49.1780735Z button 24576 0 2024-08-08T20:59:49.1781098Z sch_fq_codel 20480 17 2024-08-08T20:59:49.1781472Z dm_mod 188416 0 2024-08-08T20:59:49.1781833Z fuse 163840 1 2024-08-08T20:59:49.1782185Z loop 36864 0 2024-08-08T20:59:49.1782558Z dax 45056 1 dm_mod 2024-08-08T20:59:49.1782954Z configfs 57344 1 2024-08-08T20:59:49.1783317Z dmi_sysfs 20480 0 2024-08-08T20:59:49.1783681Z crc32_pclmul 16384 0 2024-08-08T20:59:49.1784046Z crc32c_intel 24576 0 2024-08-08T20:59:49.1784401Z efivarfs 24576 1 2024-08-08T20:59:49.1784792Z + modinfo nvidia 2024-08-08T20:59:49.1785469Z filename: /lib/modules/6.1.94-99.176.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2024-08-08T20:59:49.1786160Z alias: char-major-195-* 2024-08-08T20:59:49.1786528Z version: 550.54.15 2024-08-08T20:59:49.1786867Z supported: external 2024-08-08T20:59:49.1787199Z license: NVIDIA 2024-08-08T20:59:49.1787563Z firmware: nvidia/550.54.15/gsp_tu10x.bin 2024-08-08T20:59:49.1788043Z firmware: nvidia/550.54.15/gsp_ga10x.bin 2024-08-08T20:59:49.1788526Z srcversion: 833721318DA517F0C2FEC97 2024-08-08T20:59:49.1788994Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2024-08-08T20:59:49.1789482Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2024-08-08T20:59:49.1789960Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2024-08-08T20:59:49.1790435Z depends: i2c-core,drm 2024-08-08T20:59:49.1790789Z retpoline: Y 2024-08-08T20:59:49.1791096Z name: nvidia 2024-08-08T20:59:49.1791692Z vermagic: 6.1.94-99.176.amzn2023.x86_64 SMP preempt mod_unload modversions 2024-08-08T20:59:49.1792503Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2024-08-08T20:59:49.1793137Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2024-08-08T20:59:49.1793710Z parm: NVreg_ResmanDebugLevel:int 2024-08-08T20:59:49.1794147Z parm: NVreg_RmLogonRC:int 2024-08-08T20:59:49.1794574Z parm: NVreg_ModifyDeviceFiles:int 2024-08-08T20:59:49.1795151Z parm: NVreg_DeviceFileUID:int 2024-08-08T20:59:49.1795582Z parm: NVreg_DeviceFileGID:int 2024-08-08T20:59:49.1796014Z parm: NVreg_DeviceFileMode:int 2024-08-08T20:59:49.1796528Z parm: NVreg_InitializeSystemMemoryAllocations:int 2024-08-08T20:59:49.1797083Z parm: NVreg_UsePageAttributeTable:int 2024-08-08T20:59:49.1797557Z parm: NVreg_EnablePCIeGen3:int 2024-08-08T20:59:49.1797977Z parm: NVreg_EnableMSI:int 2024-08-08T20:59:49.1798385Z parm: NVreg_TCEBypassMode:int 2024-08-08T20:59:49.1798928Z parm: NVreg_EnableStreamMemOPs:int 2024-08-08T20:59:49.1799457Z parm: NVreg_RestrictProfilingToAdminUsers:int 2024-08-08T20:59:49.1800033Z parm: NVreg_PreserveVideoMemoryAllocations:int 2024-08-08T20:59:49.1800587Z parm: NVreg_EnableS0ixPowerManagement:int 2024-08-08T20:59:49.1801179Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2024-08-08T20:59:49.1801775Z parm: NVreg_DynamicPowerManagement:int 2024-08-08T20:59:49.1802379Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2024-08-08T20:59:49.1802956Z parm: NVreg_EnableGpuFirmware:int 2024-08-08T20:59:49.1803451Z parm: NVreg_EnableGpuFirmwareLogs:int 2024-08-08T20:59:49.1803974Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2024-08-08T20:59:49.1804506Z parm: NVreg_EnableUserNUMAManagement:int 2024-08-08T20:59:49.1804989Z parm: NVreg_MemoryPoolSize:int 2024-08-08T20:59:49.1805448Z parm: NVreg_KMallocHeapMaxSize:int 2024-08-08T20:59:49.1805937Z parm: NVreg_VMallocHeapMaxSize:int 2024-08-08T20:59:49.1806394Z parm: NVreg_IgnoreMMIOCheck:int 2024-08-08T20:59:49.1806840Z parm: NVreg_NvLinkDisable:int 2024-08-08T20:59:49.1807336Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2024-08-08T20:59:49.1807845Z parm: NVreg_RegisterPCIDriver:int 2024-08-08T20:59:49.1808326Z parm: NVreg_EnableResizableBar:int 2024-08-08T20:59:49.1808811Z parm: NVreg_EnableDbgBreakpoint:int 2024-08-08T20:59:49.1809297Z parm: NVreg_EnableNonblockingOpen:int 2024-08-08T20:59:49.1809777Z parm: NVreg_RegistryDwords:charp 2024-08-08T20:59:49.1810269Z parm: NVreg_RegistryDwordsPerDevice:charp 2024-08-08T20:59:49.1810732Z parm: NVreg_RmMsg:charp 2024-08-08T20:59:49.1811139Z parm: NVreg_GpuBlacklist:charp 2024-08-08T20:59:49.1811600Z parm: NVreg_TemporaryFilePath:charp 2024-08-08T20:59:49.1812053Z parm: NVreg_ExcludedGpus:charp 2024-08-08T20:59:49.1812512Z parm: NVreg_DmaRemapPeerMmio:int 2024-08-08T20:59:49.1812987Z parm: NVreg_RmNvlinkBandwidth:charp 2024-08-08T20:59:49.1813448Z parm: NVreg_ImexChannelCount:int 2024-08-08T20:59:49.1813896Z parm: rm_firmware_active:charp 2024-08-08T20:59:49.1814292Z + set +e 2024-08-08T20:59:49.1814594Z + nvidia-smi 2024-08-08T20:59:49.1989693Z Thu Aug 8 20:59:49 2024 2024-08-08T20:59:49.1990458Z +-----------------------------------------------------------------------------------------+ 2024-08-08T20:59:49.1991346Z | NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 | 2024-08-08T20:59:49.1992218Z |-----------------------------------------+------------------------+----------------------+ 2024-08-08T20:59:49.1993008Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2024-08-08T20:59:49.1993880Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2024-08-08T20:59:49.1994569Z | | | MIG M. | 2024-08-08T20:59:49.1995116Z |=========================================+========================+======================| 2024-08-08T20:59:49.2215455Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2024-08-08T20:59:49.2216342Z | 0% 32C P8 16W / 300W | 0MiB / 23028MiB | 0% Default | 2024-08-08T20:59:49.2216969Z | | | N/A | 2024-08-08T20:59:49.2217660Z +-----------------------------------------+------------------------+----------------------+ 2024-08-08T20:59:49.2220536Z 2024-08-08T20:59:49.2221215Z +-----------------------------------------------------------------------------------------+ 2024-08-08T20:59:49.2221989Z | Processes: | 2024-08-08T20:59:49.2222689Z | GPU GI CI PID Type Process name GPU Memory | 2024-08-08T20:59:49.2223372Z | ID ID Usage | 2024-08-08T20:59:49.2223928Z |=========================================================================================| 2024-08-08T20:59:49.2227347Z | No running processes found | 2024-08-08T20:59:49.2228118Z +-----------------------------------------------------------------------------------------+ 2024-08-08T20:59:49.5188650Z + nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 2024-08-08T20:59:49.5380390Z NVIDIA A10G 2024-08-08T20:59:49.5446975Z + NVIDIA_SMI_STATUS=0 2024-08-08T20:59:49.5447528Z + '[' 0 -eq 0 ']' 2024-08-08T20:59:49.5448053Z + echo 'INFO: Ignoring allowed status 0' 2024-08-08T20:59:49.5448481Z + set -e 2024-08-08T20:59:49.5448805Z INFO: Ignoring allowed status 0 2024-08-08T20:59:49.5457571Z == Installing nvidia container toolkit for amzn2023 == 2024-08-08T20:59:49.5461436Z + sudo yum install -y yum-utils 2024-08-08T20:59:50.0084900Z Last metadata expiration check: 0:53:29 ago on Thu Aug 8 20:06:20 2024. 2024-08-08T20:59:50.0306327Z Package dnf-utils-4.3.0-13.amzn2023.0.4.noarch is already installed. 2024-08-08T20:59:50.0609566Z Dependencies resolved. 2024-08-08T20:59:50.0735156Z Nothing to do. 2024-08-08T20:59:50.0735489Z Complete! 2024-08-08T20:59:50.1655945Z + [[ amzn2023 == \a\m\z\n\2\0\2\3 ]] 2024-08-08T20:59:50.1657156Z + YUM_REPO_URL=https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2024-08-08T20:59:50.1658945Z + sudo yum-config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2024-08-08T20:59:50.4417963Z Adding repo from: https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2024-08-08T20:59:50.5114960Z + sudo yum install -y nvidia-docker2 2024-08-08T20:59:50.9945652Z nvidia-container-toolkit 10 kB/s | 833 B 00:00 2024-08-08T20:59:51.0165912Z Package nvidia-docker2-2.14.0-1.noarch is already installed. 2024-08-08T20:59:51.0465560Z Dependencies resolved. 2024-08-08T20:59:51.0590495Z Nothing to do. 2024-08-08T20:59:51.0591107Z Complete! 2024-08-08T20:59:51.1522641Z + sudo systemctl restart docker 2024-08-08T21:00:32.8381907Z nvidia-persistenced failed to initialize. Check syslog for more details. 2024-08-08T21:00:32.8619591Z Thu Aug 8 21:00:32 2024 2024-08-08T21:00:32.8620417Z +-----------------------------------------------------------------------------------------+ 2024-08-08T21:00:32.8621242Z | NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 | 2024-08-08T21:00:32.8622030Z |-----------------------------------------+------------------------+----------------------+ 2024-08-08T21:00:32.8622831Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2024-08-08T21:00:32.8623859Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2024-08-08T21:00:32.8624572Z | | | MIG M. | 2024-08-08T21:00:32.8625104Z |=========================================+========================+======================| 2024-08-08T21:00:32.8828695Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2024-08-08T21:00:32.8829402Z | 0% 32C P8 16W / 300W | 0MiB / 23028MiB | 0% Default | 2024-08-08T21:00:32.8830026Z | | | N/A | 2024-08-08T21:00:32.8830717Z +-----------------------------------------+------------------------+----------------------+ 2024-08-08T21:00:32.8834025Z 2024-08-08T21:00:32.8835113Z +-----------------------------------------------------------------------------------------+ 2024-08-08T21:00:32.8835749Z | Processes: | 2024-08-08T21:00:32.8836435Z | GPU GI CI PID Type Process name GPU Memory | 2024-08-08T21:00:32.8837120Z | ID ID Usage | 2024-08-08T21:00:32.8837677Z |=========================================================================================| 2024-08-08T21:00:32.8839146Z | No running processes found | 2024-08-08T21:00:32.8839919Z +-----------------------------------------------------------------------------------------+ 2024-08-08T21:00:33.5058775Z Command completed after 1 attempt(s). 2024-08-08T21:00:33.5140512Z ##[group]Run python3 -m pip install psutil==5.9.1 nvidia-ml-py==11.525.84 2024-08-08T21:00:33.5141266Z python3 -m pip install psutil==5.9.1 nvidia-ml-py==11.525.84 2024-08-08T21:00:33.5141941Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 & 2024-08-08T21:00:33.5142582Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2024-08-08T21:00:33.5154890Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:00:33.5155400Z env: 2024-08-08T21:00:33.5155671Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:00:33.5156121Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:00:33.5156597Z ##[endgroup] 2024-08-08T21:00:33.7582678Z Defaulting to user installation because normal site-packages is not writeable 2024-08-08T21:00:33.7751469Z Requirement already satisfied: psutil==5.9.1 in /home/ec2-user/.local/lib/python3.9/site-packages (5.9.1) 2024-08-08T21:00:33.7756798Z Requirement already satisfied: nvidia-ml-py==11.525.84 in /home/ec2-user/.local/lib/python3.9/site-packages (11.525.84) 2024-08-08T21:00:33.8999467Z Prepare all required actions 2024-08-08T21:00:33.8999959Z Getting action download info 2024-08-08T21:00:34.0027932Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2024-08-08T21:00:34.1869245Z Download action repository 'actions/download-artifact@v3' (SHA:9bc31d5ccc31df68ecc42ccf4149144866c47d8a) 2024-08-08T21:00:34.3269418Z ##[group]Run ./.github/actions/download-build-artifacts 2024-08-08T21:00:34.3269891Z with: 2024-08-08T21:00:34.3270215Z name: linux-focal-cuda12.4-py3.10-gcc9-sm86 2024-08-08T21:00:34.3270660Z s3-bucket: gha-artifacts 2024-08-08T21:00:34.3270990Z env: 2024-08-08T21:00:34.3271250Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:00:34.3271689Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:00:34.3272269Z ##[endgroup] 2024-08-08T21:00:34.3305820Z ##[group]Run seemethere/download-artifact-s3@v4 2024-08-08T21:00:34.3306252Z with: 2024-08-08T21:00:34.3306623Z name: linux-focal-cuda12.4-py3.10-gcc9-sm86 2024-08-08T21:00:34.3307068Z s3-bucket: gha-artifacts 2024-08-08T21:00:34.3307424Z region: us-east-1 2024-08-08T21:00:34.3307716Z env: 2024-08-08T21:00:34.3307981Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:00:34.3308424Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:00:34.3308901Z ##[endgroup] 2024-08-08T21:00:34.8238618Z (node:250141) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2024-08-08T21:00:34.8239711Z 2024-08-08T21:00:34.8239973Z Please migrate your code to use AWS SDK for JavaScript (v3). 2024-08-08T21:00:34.8240683Z For more information, check the migration guide at https://a.co/7PzMCcy 2024-08-08T21:00:34.8241567Z (Use `node --trace-warnings ...` to show where the warning was created) 2024-08-08T21:00:34.8941093Z Found 1 objects with prefix pytorch/pytorch/10308807531/linux-focal-cuda12.4-py3.10-gcc9-sm86/ 2024-08-08T21:00:34.8943216Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2024-08-08T21:00:52.8635218Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2024-08-08T21:00:52.8642165Z Artifact download has finished successfully 2024-08-08T21:00:52.9006847Z ##[group]Run unzip -o artifacts.zip 2024-08-08T21:00:52.9007286Z unzip -o artifacts.zip 2024-08-08T21:00:52.9017125Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:00:52.9017622Z env: 2024-08-08T21:00:52.9017905Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:00:52.9018350Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:00:52.9018819Z ##[endgroup] 2024-08-08T21:00:52.9065436Z Archive: artifacts.zip 2024-08-08T21:00:52.9066867Z creating: dist/ 2024-08-08T21:00:55.0352733Z inflating: dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl 2024-08-08T21:00:55.0353414Z creating: build/custom_test_artifacts/ 2024-08-08T21:00:55.0354005Z creating: build/custom_test_artifacts/custom-op-build/ 2024-08-08T21:00:55.0354750Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2024-08-08T21:00:55.0355585Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2024-08-08T21:00:55.0362688Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2024-08-08T21:00:55.0365180Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/ 2024-08-08T21:00:55.0366227Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-08-08T21:00:55.0367492Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-08-08T21:00:55.0368495Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-08-08T21:00:55.0369946Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-08-08T21:00:55.0371441Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-08-08T21:00:55.0372575Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-08-08T21:00:55.0373695Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-08-08T21:00:55.0375019Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-08-08T21:00:55.0376355Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-08-08T21:00:55.0377711Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-08-08T21:00:55.0378981Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-08-08T21:00:55.0380360Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-08-08T21:00:55.0381691Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-08-08T21:00:55.0382787Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/ 2024-08-08T21:00:55.0383786Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/ 2024-08-08T21:00:55.0424287Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2024-08-08T21:00:55.0465682Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2024-08-08T21:00:55.0467149Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2024-08-08T21:00:55.0513823Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2024-08-08T21:00:55.0515662Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2024-08-08T21:00:55.0517412Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2024-08-08T21:00:55.0519187Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2024-08-08T21:00:55.0520827Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2024-08-08T21:00:55.0522457Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2024-08-08T21:00:55.0524160Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2024-08-08T21:00:55.0525777Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2024-08-08T21:00:55.0527297Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2024-08-08T21:00:55.0528724Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2024-08-08T21:00:55.0530020Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.reg.c 2024-08-08T21:00:55.0531236Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin 2024-08-08T21:00:55.0532453Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2024-08-08T21:00:55.0533622Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.o 2024-08-08T21:00:55.0534813Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/CMakeCUDACompilerId.cu 2024-08-08T21:00:55.0603292Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CompilerIdCUDA/a.out 2024-08-08T21:00:55.0677231Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CUDA.bin 2024-08-08T21:00:55.0678590Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.26.4/CMakeCUDACompiler.cmake 2024-08-08T21:00:55.0679734Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2024-08-08T21:00:55.0680835Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2024-08-08T21:00:55.0681922Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2024-08-08T21:00:55.0683042Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2024-08-08T21:00:55.0684169Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2024-08-08T21:00:55.0685540Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2024-08-08T21:00:55.0686835Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2024-08-08T21:00:55.0688078Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2024-08-08T21:00:55.0689142Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2024-08-08T21:00:55.0690408Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2024-08-08T21:00:55.0691473Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2024-08-08T21:00:55.0692530Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2024-08-08T21:00:55.0693582Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2024-08-08T21:00:55.0709297Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2024-08-08T21:00:55.0854153Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2024-08-08T21:00:55.0856662Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2024-08-08T21:00:55.0859291Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2024-08-08T21:00:55.0860700Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2024-08-08T21:00:55.0861872Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2024-08-08T21:00:55.0862951Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2024-08-08T21:00:55.0864060Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2024-08-08T21:00:55.0865171Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2024-08-08T21:00:55.0866288Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2024-08-08T21:00:55.0867399Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2024-08-08T21:00:55.0868496Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2024-08-08T21:00:55.0882157Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2024-08-08T21:00:55.0965766Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2024-08-08T21:00:55.0967222Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-08-08T21:00:55.0968429Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2024-08-08T21:00:55.0969748Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2024-08-08T21:00:55.0970895Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2024-08-08T21:00:55.0971841Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2024-08-08T21:00:55.0972729Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2024-08-08T21:00:55.0974100Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2024-08-08T21:00:55.0975198Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2024-08-08T21:00:55.0976309Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2024-08-08T21:00:55.1099256Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2024-08-08T21:00:55.1161874Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2024-08-08T21:00:55.1162694Z creating: build/custom_test_artifacts/jit-hook-build/ 2024-08-08T21:00:55.1163532Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2024-08-08T21:00:55.1164345Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2024-08-08T21:00:55.1171310Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2024-08-08T21:00:55.1172409Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/ 2024-08-08T21:00:55.1173738Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-08-08T21:00:55.1174711Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-08-08T21:00:55.1175776Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-08-08T21:00:55.1176955Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-08-08T21:00:55.1178390Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-08-08T21:00:55.1179419Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-08-08T21:00:55.1180397Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-08-08T21:00:55.1182816Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-08-08T21:00:55.1184251Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-08-08T21:00:55.1186419Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-08-08T21:00:55.1187539Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-08-08T21:00:55.1189135Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-08-08T21:00:55.1190608Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-08-08T21:00:55.1191635Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/ 2024-08-08T21:00:55.1192669Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/ 2024-08-08T21:00:55.1234545Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2024-08-08T21:00:55.1275468Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2024-08-08T21:00:55.1276885Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2024-08-08T21:00:55.1324594Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2024-08-08T21:00:55.1325978Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2024-08-08T21:00:55.1327813Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2024-08-08T21:00:55.1329280Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2024-08-08T21:00:55.1330717Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2024-08-08T21:00:55.1332182Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2024-08-08T21:00:55.1333766Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2024-08-08T21:00:55.1335337Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2024-08-08T21:00:55.1336879Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2024-08-08T21:00:55.1338326Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2024-08-08T21:00:55.1339725Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.reg.c 2024-08-08T21:00:55.1341098Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin 2024-08-08T21:00:55.1342444Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2024-08-08T21:00:55.1343612Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.o 2024-08-08T21:00:55.1344802Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/CMakeCUDACompilerId.cu 2024-08-08T21:00:55.1416013Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CompilerIdCUDA/a.out 2024-08-08T21:00:55.1490138Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CUDA.bin 2024-08-08T21:00:55.1491290Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.26.4/CMakeCUDACompiler.cmake 2024-08-08T21:00:55.1492265Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2024-08-08T21:00:55.1493109Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2024-08-08T21:00:55.1493992Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2024-08-08T21:00:55.1494927Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2024-08-08T21:00:55.1495970Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2024-08-08T21:00:55.1497139Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2024-08-08T21:00:55.1498264Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2024-08-08T21:00:55.1499317Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2024-08-08T21:00:55.1500408Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2024-08-08T21:00:55.1501497Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2024-08-08T21:00:55.1502593Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2024-08-08T21:00:55.1503683Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2024-08-08T21:00:55.1504765Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2024-08-08T21:00:55.1525940Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2024-08-08T21:00:55.1590690Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2024-08-08T21:00:55.1592494Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-08-08T21:00:55.1594040Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2024-08-08T21:00:55.1595412Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2024-08-08T21:00:55.1596688Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2024-08-08T21:00:55.1597930Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2024-08-08T21:00:55.1599232Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2024-08-08T21:00:55.1601401Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2024-08-08T21:00:55.1602805Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2024-08-08T21:00:55.1604139Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2024-08-08T21:00:55.1655145Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2024-08-08T21:00:55.1656199Z creating: build/custom_test_artifacts/custom-backend-build/ 2024-08-08T21:00:55.1657281Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2024-08-08T21:00:55.1658552Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2024-08-08T21:00:55.1665560Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2024-08-08T21:00:55.1666924Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/ 2024-08-08T21:00:55.1668321Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeSystem.cmake 2024-08-08T21:00:55.1669844Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/ 2024-08-08T21:00:55.1671272Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/tmp/ 2024-08-08T21:00:55.1673013Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/CMakeCCompilerId.c 2024-08-08T21:00:55.1674623Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdC/a.out 2024-08-08T21:00:55.1676177Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/ 2024-08-08T21:00:55.1677645Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/tmp/ 2024-08-08T21:00:55.1679366Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/CMakeCXXCompilerId.cpp 2024-08-08T21:00:55.1681092Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCXX/a.out 2024-08-08T21:00:55.1682675Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_C.bin 2024-08-08T21:00:55.1684349Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCCompiler.cmake 2024-08-08T21:00:55.1686207Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CXX.bin 2024-08-08T21:00:55.1687969Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCXXCompiler.cmake 2024-08-08T21:00:55.1689091Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/ 2024-08-08T21:00:55.1690148Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/ 2024-08-08T21:00:55.1732590Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2024-08-08T21:00:55.1774793Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2024-08-08T21:00:55.1777243Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2024-08-08T21:00:55.1824391Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2024-08-08T21:00:55.1826444Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2024-08-08T21:00:55.1828508Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2024-08-08T21:00:55.1830623Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2024-08-08T21:00:55.1832831Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2024-08-08T21:00:55.1834853Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2024-08-08T21:00:55.1836893Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2024-08-08T21:00:55.1838903Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2024-08-08T21:00:55.1840868Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2024-08-08T21:00:55.1842971Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2024-08-08T21:00:55.1844899Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.reg.c 2024-08-08T21:00:55.1846768Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin 2024-08-08T21:00:55.1848586Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2024-08-08T21:00:55.1850385Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/tmp/a_dlink.o 2024-08-08T21:00:55.1852251Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/CMakeCUDACompilerId.cu 2024-08-08T21:00:55.1920569Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CompilerIdCUDA/a.out 2024-08-08T21:00:55.1994933Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeDetermineCompilerABI_CUDA.bin 2024-08-08T21:00:55.1996656Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.26.4/CMakeCUDACompiler.cmake 2024-08-08T21:00:55.1998118Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2024-08-08T21:00:55.1999432Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2024-08-08T21:00:55.2000786Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2024-08-08T21:00:55.2002204Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2024-08-08T21:00:55.2003808Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2024-08-08T21:00:55.2005565Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2024-08-08T21:00:55.2007241Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2024-08-08T21:00:55.2008860Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2024-08-08T21:00:55.2010568Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2024-08-08T21:00:55.2012398Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2024-08-08T21:00:55.2014084Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2024-08-08T21:00:55.2016082Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2024-08-08T21:00:55.2017830Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2024-08-08T21:00:55.2019579Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2024-08-08T21:00:55.2143603Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2024-08-08T21:00:55.2145245Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2024-08-08T21:00:55.2146919Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2024-08-08T21:00:55.2148762Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2024-08-08T21:00:55.2150536Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2024-08-08T21:00:55.2152417Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2024-08-08T21:00:55.2154419Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2024-08-08T21:00:55.2156222Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2024-08-08T21:00:55.2157994Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2024-08-08T21:00:55.2159770Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2024-08-08T21:00:55.2161510Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2024-08-08T21:00:55.2177315Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2024-08-08T21:00:55.2233733Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2024-08-08T21:00:55.2235566Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2024-08-08T21:00:55.2237162Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2024-08-08T21:00:55.2238633Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2024-08-08T21:00:55.2239980Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2024-08-08T21:00:55.2241352Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2024-08-08T21:00:55.2242738Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2024-08-08T21:00:55.2244019Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2024-08-08T21:00:55.2245199Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2024-08-08T21:00:55.2246352Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2024-08-08T21:00:55.2350399Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2024-08-08T21:00:55.2393995Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2024-08-08T21:00:55.2394763Z creating: build/lib/ 2024-08-08T21:00:55.2483784Z inflating: build/lib/libprotobuf-lite.a 2024-08-08T21:00:55.2939979Z inflating: build/lib/libprotobuf.a 2024-08-08T21:00:55.2949683Z inflating: build/lib/libpthreadpool.a 2024-08-08T21:00:55.2958331Z inflating: build/lib/libcpuinfo.a 2024-08-08T21:00:55.2966802Z inflating: build/lib/libcpuinfo_internals.a 2024-08-08T21:00:55.2967599Z inflating: build/lib/libclog.a 2024-08-08T21:00:55.2986399Z inflating: build/lib/libnnpack.a 2024-08-08T21:00:55.2989394Z inflating: build/lib/libnnpack_reference_layers.a 2024-08-08T21:00:55.3055530Z inflating: build/lib/libgtest.a 2024-08-08T21:00:55.3130512Z inflating: build/lib/libbenchmark.a 2024-08-08T21:00:55.3194735Z inflating: build/lib/libasmjit.a 2024-08-08T21:00:55.3202861Z inflating: build/lib/libittnotify.a 2024-08-08T21:00:55.3231406Z inflating: build/lib/libtensorpipe_uv.a 2024-08-08T21:00:55.3359233Z inflating: build/lib/libgloo.a 2024-08-08T21:00:55.3380319Z inflating: build/lib/libfmt.a 2024-08-08T21:00:55.3475469Z inflating: build/lib/libc10.so 2024-08-08T21:00:55.3477536Z inflating: build/lib/libcaffe2_nvrtc.so 2024-08-08T21:00:55.3478281Z inflating: build/lib/libfoxi_loader.a 2024-08-08T21:00:55.3480517Z inflating: build/lib/libtorch_global_deps.so 2024-08-08T21:00:55.3500204Z inflating: build/lib/libpytorch_qnnpack.a 2024-08-08T21:00:55.3519349Z inflating: build/lib/libgmock.a 2024-08-08T21:00:55.3521269Z inflating: build/lib/libgtest_main.a 2024-08-08T21:00:55.3522139Z inflating: build/lib/libbenchmark_main.a 2024-08-08T21:00:55.4027261Z inflating: build/lib/libprotoc.a 2024-08-08T21:00:55.4417789Z inflating: build/lib/libgloo_cuda.a 2024-08-08T21:00:56.4603288Z inflating: build/lib/libdnnl.a 2024-08-08T21:00:56.5177872Z inflating: build/lib/libtensorpipe.a 2024-08-08T21:00:56.5236254Z inflating: build/lib/libc10_cuda.so 2024-08-08T21:00:56.5237121Z inflating: build/lib/libgmock_main.a 2024-08-08T21:00:56.6497525Z inflating: build/lib/libfbgemm.a 2024-08-08T21:00:56.6754002Z inflating: build/lib/libtensorpipe_cuda.a 2024-08-08T21:00:56.7260845Z inflating: build/lib/libkineto.a 2024-08-08T21:00:56.7445174Z inflating: build/lib/libXNNPACK.a 2024-08-08T21:00:56.7488047Z inflating: build/lib/libonnx_proto.a 2024-08-08T21:00:56.8182219Z inflating: build/lib/libonnx.a 2024-08-08T21:00:59.2468634Z inflating: build/lib/libtorch_cpu.so 2024-08-08T21:00:59.2473315Z inflating: build/lib/libunbox_lib.a 2024-08-08T21:00:59.2477880Z inflating: build/lib/libshm.so 2024-08-08T21:01:01.3122702Z inflating: build/lib/libtorch_cuda.so 2024-08-08T21:01:01.3123899Z inflating: build/lib/libtorch.so 2024-08-08T21:01:01.3127428Z inflating: build/lib/libc10d_cuda_test.so 2024-08-08T21:01:02.1932429Z inflating: build/lib/libtorch_cuda_linalg.so 2024-08-08T21:01:02.3927827Z inflating: build/lib/libtorch_python.so 2024-08-08T21:01:02.4000068Z inflating: build/lib/libtorchbind_test.so 2024-08-08T21:01:02.4022102Z inflating: build/lib/libjitbackend_test.so 2024-08-08T21:01:02.4048281Z inflating: build/lib/libbackend_with_compiler.so 2024-08-08T21:01:02.4074834Z inflating: build/lib/libaoti_custom_ops.so 2024-08-08T21:01:02.4109842Z inflating: build/lib/libnnapi_backend.so 2024-08-08T21:01:02.4110477Z creating: build/bin/ 2024-08-08T21:01:02.4161997Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2024-08-08T21:01:02.4214097Z inflating: build/bin/c10_DeviceGuard_test 2024-08-08T21:01:02.4265553Z inflating: build/bin/c10_Device_test 2024-08-08T21:01:02.4325652Z inflating: build/bin/c10_DispatchKeySet_test 2024-08-08T21:01:02.4379668Z inflating: build/bin/c10_Scalar_test 2024-08-08T21:01:02.4429579Z inflating: build/bin/c10_StreamGuard_test 2024-08-08T21:01:02.4481651Z inflating: build/bin/c10_SymInt_test 2024-08-08T21:01:02.4537113Z inflating: build/bin/c10_InlineDeviceGuard_test 2024-08-08T21:01:02.4593678Z inflating: build/bin/c10_InlineStreamGuard_test 2024-08-08T21:01:02.4651149Z inflating: build/bin/c10_SizesAndStrides_test 2024-08-08T21:01:02.4723108Z inflating: build/bin/c10_cow_test 2024-08-08T21:01:02.4777025Z inflating: build/bin/c10_Bitset_test 2024-08-08T21:01:02.4827250Z inflating: build/bin/c10_ConstexprCrc_test 2024-08-08T21:01:02.4877705Z inflating: build/bin/c10_DeadlockDetection_test 2024-08-08T21:01:02.4929494Z inflating: build/bin/c10_Half_test 2024-08-08T21:01:02.4986721Z inflating: build/bin/c10_LeftRight_test 2024-08-08T21:01:02.5043051Z inflating: build/bin/c10_Metaprogramming_test 2024-08-08T21:01:02.5094011Z inflating: build/bin/c10_Synchronized_test 2024-08-08T21:01:02.5150846Z inflating: build/bin/c10_ThreadLocal_test 2024-08-08T21:01:02.5203434Z inflating: build/bin/c10_TypeIndex_test 2024-08-08T21:01:02.5255584Z inflating: build/bin/c10_TypeList_test 2024-08-08T21:01:02.5305475Z inflating: build/bin/c10_TypeTraits_test 2024-08-08T21:01:02.5358600Z inflating: build/bin/c10_accumulate_test 2024-08-08T21:01:02.5415349Z inflating: build/bin/c10_bfloat16_test 2024-08-08T21:01:02.5466447Z inflating: build/bin/c10_bit_cast_test 2024-08-08T21:01:02.5524051Z inflating: build/bin/c10_complex_math_test 2024-08-08T21:01:02.5580329Z inflating: build/bin/c10_complex_test 2024-08-08T21:01:02.5634281Z inflating: build/bin/c10_exception_test 2024-08-08T21:01:02.5685344Z inflating: build/bin/c10_flags_test 2024-08-08T21:01:02.5736721Z inflating: build/bin/c10_generic_math_test 2024-08-08T21:01:02.5902814Z inflating: build/bin/c10_intrusive_ptr_test 2024-08-08T21:01:02.5954392Z inflating: build/bin/c10_irange_test 2024-08-08T21:01:02.6008753Z inflating: build/bin/c10_lazy_test 2024-08-08T21:01:02.6067162Z inflating: build/bin/c10_logging_test 2024-08-08T21:01:02.6144029Z inflating: build/bin/c10_optional_test 2024-08-08T21:01:02.6207598Z inflating: build/bin/c10_ordered_preserving_dict_test 2024-08-08T21:01:02.6262462Z inflating: build/bin/c10_registry_test 2024-08-08T21:01:02.6414747Z inflating: build/bin/c10_small_vector_test 2024-08-08T21:01:02.6467351Z inflating: build/bin/c10_ssize_test 2024-08-08T21:01:02.6520534Z inflating: build/bin/c10_string_util_test 2024-08-08T21:01:02.6580115Z inflating: build/bin/c10_string_view_test 2024-08-08T21:01:02.6631391Z inflating: build/bin/c10_tempfile_test 2024-08-08T21:01:02.6688458Z inflating: build/bin/c10_typeid_test 2024-08-08T21:01:02.6738334Z inflating: build/bin/c10_intrusive_ptr_benchmark 2024-08-08T21:01:02.7188340Z inflating: build/bin/protoc-3.13.0.0 2024-08-08T21:01:02.7637181Z inflating: build/bin/protoc 2024-08-08T21:01:02.7691080Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 2024-08-08T21:01:02.7745191Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2024-08-08T21:01:02.7798000Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2024-08-08T21:01:02.7853005Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2024-08-08T21:01:02.7906582Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2024-08-08T21:01:02.7961037Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2024-08-08T21:01:02.8010907Z inflating: build/bin/c10_cuda_CUDATest 2024-08-08T21:01:02.8065571Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2024-08-08T21:01:02.8397919Z inflating: build/bin/vec_test_all_types_DEFAULT 2024-08-08T21:01:02.8747722Z inflating: build/bin/vec_test_all_types_AVX512 2024-08-08T21:01:02.9108653Z inflating: build/bin/vec_test_all_types_AVX2 2024-08-08T21:01:02.9162470Z inflating: build/bin/BackoffTest 2024-08-08T21:01:02.9216554Z inflating: build/bin/FileStoreTest 2024-08-08T21:01:02.9273722Z inflating: build/bin/TCPStoreTest 2024-08-08T21:01:02.9328163Z inflating: build/bin/HashStoreTest 2024-08-08T21:01:02.9342428Z inflating: build/bin/ProcessGroupMPITest 2024-08-08T21:01:02.9397530Z inflating: build/bin/test_edge_op_registration 2024-08-08T21:01:02.9402443Z inflating: build/bin/torch_shm_manager 2024-08-08T21:01:02.9406321Z inflating: build/bin/example_allreduce 2024-08-08T21:01:02.9462018Z inflating: build/bin/test_dist_autograd 2024-08-08T21:01:02.9532389Z inflating: build/bin/test_cpp_rpc 2024-08-08T21:01:02.9535669Z inflating: build/bin/parallel_benchmark 2024-08-08T21:01:02.9603905Z inflating: build/bin/test_mobile_nnc 2024-08-08T21:01:02.9613448Z inflating: build/bin/aot_model_compiler_test 2024-08-08T21:01:02.9962867Z inflating: build/bin/test_lazy 2024-08-08T21:01:03.1149511Z inflating: build/bin/test_api 2024-08-08T21:01:03.1224030Z inflating: build/bin/Dict_test 2024-08-08T21:01:03.1276911Z inflating: build/bin/Dimname_test 2024-08-08T21:01:03.1342589Z inflating: build/bin/MaybeOwned_test 2024-08-08T21:01:03.1400820Z inflating: build/bin/NamedTensor_test 2024-08-08T21:01:03.1461122Z inflating: build/bin/apply_utils_test 2024-08-08T21:01:03.1521144Z inflating: build/bin/atest 2024-08-08T21:01:03.1584937Z inflating: build/bin/basic 2024-08-08T21:01:03.1641022Z inflating: build/bin/broadcast_test 2024-08-08T21:01:03.1692404Z inflating: build/bin/cpu_allocator_test 2024-08-08T21:01:03.1752319Z inflating: build/bin/cpu_generator_test 2024-08-08T21:01:03.1806416Z inflating: build/bin/cpu_profiling_allocator_test 2024-08-08T21:01:03.1900850Z inflating: build/bin/cpu_rng_test 2024-08-08T21:01:03.1952590Z inflating: build/bin/dispatch_key_set_test 2024-08-08T21:01:03.2003747Z inflating: build/bin/dlconvertor_test 2024-08-08T21:01:03.2063892Z inflating: build/bin/extension_backend_test 2024-08-08T21:01:03.2120023Z inflating: build/bin/half_test 2024-08-08T21:01:03.2217510Z inflating: build/bin/ivalue_test 2024-08-08T21:01:03.2267908Z inflating: build/bin/lazy_tensor_test 2024-08-08T21:01:03.2323341Z inflating: build/bin/math_kernel_test 2024-08-08T21:01:03.2378614Z inflating: build/bin/memory_format_test 2024-08-08T21:01:03.2432936Z inflating: build/bin/memory_overlapping_test 2024-08-08T21:01:03.2486848Z inflating: build/bin/mobile_memory_cleanup 2024-08-08T21:01:03.2543885Z inflating: build/bin/native_test 2024-08-08T21:01:03.2595392Z inflating: build/bin/operator_name_test 2024-08-08T21:01:03.2648164Z inflating: build/bin/operators_test 2024-08-08T21:01:03.2700835Z inflating: build/bin/packedtensoraccessor_test 2024-08-08T21:01:03.2769107Z inflating: build/bin/pow_test 2024-08-08T21:01:03.2827601Z inflating: build/bin/quantized_test 2024-08-08T21:01:03.2878603Z inflating: build/bin/reduce_ops_test 2024-08-08T21:01:03.2930773Z inflating: build/bin/reportMemoryUsage_test 2024-08-08T21:01:03.2988425Z inflating: build/bin/scalar_tensor_test 2024-08-08T21:01:03.3048078Z inflating: build/bin/scalar_test 2024-08-08T21:01:03.3101436Z inflating: build/bin/StorageUtils_test 2024-08-08T21:01:03.3155213Z inflating: build/bin/stride_properties_test 2024-08-08T21:01:03.3235428Z inflating: build/bin/tensor_iterator_test 2024-08-08T21:01:03.3290423Z inflating: build/bin/test_parallel 2024-08-08T21:01:03.3294098Z inflating: build/bin/thread_init_test 2024-08-08T21:01:03.3350939Z inflating: build/bin/type_ptr_test 2024-08-08T21:01:03.3411675Z inflating: build/bin/type_test 2024-08-08T21:01:03.3464993Z inflating: build/bin/undefined_tensor_test 2024-08-08T21:01:03.3466664Z inflating: build/bin/verify_api_visibility 2024-08-08T21:01:03.3538303Z inflating: build/bin/legacy_vmap_test 2024-08-08T21:01:03.3590757Z inflating: build/bin/weakref_test 2024-08-08T21:01:03.3643161Z inflating: build/bin/wrapdim_test 2024-08-08T21:01:03.3695518Z inflating: build/bin/xla_tensor_test 2024-08-08T21:01:03.3756888Z inflating: build/bin/IListRef_test 2024-08-08T21:01:03.3864502Z inflating: build/bin/List_test 2024-08-08T21:01:03.3932193Z inflating: build/bin/KernelFunction_test 2024-08-08T21:01:03.4055588Z inflating: build/bin/kernel_function_legacy_test 2024-08-08T21:01:03.4153624Z inflating: build/bin/kernel_function_test 2024-08-08T21:01:03.4284000Z inflating: build/bin/kernel_lambda_legacy_test 2024-08-08T21:01:03.4388968Z inflating: build/bin/kernel_lambda_test 2024-08-08T21:01:03.4452255Z inflating: build/bin/kernel_stackbased_test 2024-08-08T21:01:03.4549964Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2024-08-08T21:01:03.4601841Z inflating: build/bin/CppSignature_test 2024-08-08T21:01:03.4658717Z inflating: build/bin/backend_fallback_test 2024-08-08T21:01:03.4708408Z inflating: build/bin/op_allowlist_test 2024-08-08T21:01:03.4773371Z inflating: build/bin/inline_container_test 2024-08-08T21:01:03.5078966Z inflating: build/bin/op_registration_test 2024-08-08T21:01:03.5133231Z inflating: build/bin/cuda_apply_test 2024-08-08T21:01:03.5185850Z inflating: build/bin/cuda_allocator_test 2024-08-08T21:01:03.5241796Z inflating: build/bin/cuda_caching_host_allocator_test 2024-08-08T21:01:03.5301647Z inflating: build/bin/cuda_atomic_ops_test 2024-08-08T21:01:03.5372982Z inflating: build/bin/cuda_complex_math_test 2024-08-08T21:01:03.5433689Z inflating: build/bin/cuda_complex_test 2024-08-08T21:01:03.5484002Z inflating: build/bin/cuda_device_test 2024-08-08T21:01:03.5543838Z inflating: build/bin/cuda_cub_test 2024-08-08T21:01:03.5595704Z inflating: build/bin/cuda_dlconvertor_test 2024-08-08T21:01:03.5661942Z inflating: build/bin/cuda_distributions_test 2024-08-08T21:01:03.5720079Z inflating: build/bin/cuda_generator_test 2024-08-08T21:01:03.5770834Z inflating: build/bin/cuda_half_test 2024-08-08T21:01:03.5823457Z inflating: build/bin/cuda_integer_divider_test 2024-08-08T21:01:03.5875358Z inflating: build/bin/cuda_optional_test 2024-08-08T21:01:03.5928540Z inflating: build/bin/cuda_packedtensoraccessor_test 2024-08-08T21:01:03.5982037Z inflating: build/bin/cuda_reportMemoryUsage_test 2024-08-08T21:01:03.6033199Z inflating: build/bin/cuda_allocatorTraceTracker_test 2024-08-08T21:01:03.6094723Z inflating: build/bin/cuda_stream_test 2024-08-08T21:01:03.6145173Z inflating: build/bin/cuda_cudnn_test 2024-08-08T21:01:03.6198036Z inflating: build/bin/cuda_vectorized_test 2024-08-08T21:01:03.6213017Z inflating: build/bin/tutorial_tensorexpr 2024-08-08T21:01:03.6280182Z inflating: build/bin/ProcessGroupGlooTest 2024-08-08T21:01:03.6338594Z inflating: build/bin/ProcessGroupGlooAsyncTest 2024-08-08T21:01:03.6403219Z inflating: build/bin/ProcessGroupNCCLTest 2024-08-08T21:01:03.6466550Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2024-08-08T21:01:03.7299066Z inflating: build/bin/test_tensorexpr 2024-08-08T21:01:03.7883089Z inflating: build/bin/test_jit 2024-08-08T21:01:03.7883701Z creating: .additional_ci_files/ 2024-08-08T21:01:03.7945423Z inflating: .additional_ci_files/test-times.json 2024-08-08T21:01:03.8186324Z inflating: .additional_ci_files/test-class-times.json 2024-08-08T21:01:03.8220496Z ##[group]Run rm artifacts.zip 2024-08-08T21:01:03.8220877Z rm artifacts.zip 2024-08-08T21:01:03.8230558Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:03.8231041Z env: 2024-08-08T21:01:03.8231315Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:03.8231759Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:03.8232368Z ##[endgroup] 2024-08-08T21:01:03.9463671Z ##[group]Run df -H 2024-08-08T21:01:03.9463982Z df -H 2024-08-08T21:01:03.9472909Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:03.9473392Z env: 2024-08-08T21:01:03.9473669Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:03.9474112Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:03.9474579Z ##[endgroup] 2024-08-08T21:01:03.9523893Z Filesystem Size Used Avail Use% Mounted on 2024-08-08T21:01:03.9524560Z devtmpfs 4.2M 0 4.2M 0% /dev 2024-08-08T21:01:03.9525180Z tmpfs 34G 0 34G 0% /dev/shm 2024-08-08T21:01:03.9525752Z tmpfs 14G 553k 14G 1% /run 2024-08-08T21:01:03.9527744Z /dev/nvme0n1p1 161G 38G 124G 24% / 2024-08-08T21:01:03.9528338Z tmpfs 34G 17k 34G 1% /tmp 2024-08-08T21:01:03.9528973Z /dev/nvme0n1p128 11M 1.4M 9.2M 13% /boot/efi 2024-08-08T21:01:03.9529482Z tmpfs 6.7G 0 6.7G 0% /run/user/0 2024-08-08T21:01:03.9568083Z Prepare all required actions 2024-08-08T21:01:03.9568509Z Getting action download info 2024-08-08T21:01:04.0732372Z ##[group]Run ./.github/actions/download-td-artifacts 2024-08-08T21:01:04.0732828Z with: 2024-08-08T21:01:04.0733081Z env: 2024-08-08T21:01:04.0733355Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:04.0733798Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:04.0734274Z ##[endgroup] 2024-08-08T21:01:04.0766998Z ##[group]Run seemethere/download-artifact-s3@v4 2024-08-08T21:01:04.0767437Z with: 2024-08-08T21:01:04.0767704Z name: td_results 2024-08-08T21:01:04.0768012Z s3-bucket: gha-artifacts 2024-08-08T21:01:04.0768361Z region: us-east-1 2024-08-08T21:01:04.0768655Z env: 2024-08-08T21:01:04.0768916Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:04.0769359Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:04.0769833Z ##[endgroup] 2024-08-08T21:01:04.5571569Z (node:250167) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2024-08-08T21:01:04.5572284Z 2024-08-08T21:01:04.5572610Z Please migrate your code to use AWS SDK for JavaScript (v3). 2024-08-08T21:01:04.5573791Z For more information, check the migration guide at https://a.co/7PzMCcy 2024-08-08T21:01:04.5574716Z (Use `node --trace-warnings ...` to show where the warning was created) 2024-08-08T21:01:04.6266638Z Found 1 objects with prefix pytorch/pytorch/10308807531/td_results/ 2024-08-08T21:01:04.6267607Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2024-08-08T21:01:04.6789630Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2024-08-08T21:01:04.6795954Z Artifact download has finished successfully 2024-08-08T21:01:04.7124448Z ##[group]Run mkdir -p .additional_ci_files 2024-08-08T21:01:04.7124932Z mkdir -p .additional_ci_files 2024-08-08T21:01:04.7125498Z mv td_results.json .additional_ci_files/td_results.json 2024-08-08T21:01:04.7135156Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:04.7135644Z env: 2024-08-08T21:01:04.7135940Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:04.7136389Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:04.7136855Z ##[endgroup] 2024-08-08T21:01:04.7229504Z ##[group]Run .github/scripts/parse_ref.py 2024-08-08T21:01:04.7230013Z .github/scripts/parse_ref.py 2024-08-08T21:01:04.7239101Z shell: /usr/bin/bash -e {0} 2024-08-08T21:01:04.7239449Z env: 2024-08-08T21:01:04.7239727Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:04.7240165Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:04.7240639Z ##[endgroup] 2024-08-08T21:01:04.7523072Z Prepare all required actions 2024-08-08T21:01:04.7561662Z ##[group]Run ./.github/actions/get-workflow-job-id 2024-08-08T21:01:04.7562107Z with: 2024-08-08T21:01:04.7562667Z github-token: *** 2024-08-08T21:01:04.7562966Z env: 2024-08-08T21:01:04.7563241Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:04.7563683Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:04.7564164Z ##[endgroup] 2024-08-08T21:01:04.7581018Z ##[group]Run set -eux 2024-08-08T21:01:04.7581354Z set -eux 2024-08-08T21:01:04.7581931Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2024-08-08T21:01:04.7590843Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:04.7591324Z env: 2024-08-08T21:01:04.7591598Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:04.7592122Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:04.7592749Z GITHUB_TOKEN: *** 2024-08-08T21:01:04.7593058Z ##[endgroup] 2024-08-08T21:01:04.7621564Z + python3 .github/scripts/get_workflow_job_id.py 10308807531 i-09ba204a38896644d 2024-08-08T21:01:05.9939229Z setting job-id=28538801350 2024-08-08T21:01:05.9940133Z setting job-name=linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:06.0154961Z Prepare all required actions 2024-08-08T21:01:06.0155413Z Getting action download info 2024-08-08T21:01:06.1287885Z ##[group]Run ./.github/actions/filter-test-configs 2024-08-08T21:01:06.1288333Z with: 2024-08-08T21:01:06.1288789Z github-token: *** 2024-08-08T21:01:06.1291021Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]} 2024-08-08T21:01:06.1293700Z job-name: linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:06.1294462Z env: 2024-08-08T21:01:06.1294765Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:06.1295205Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:06.1295941Z ##[endgroup] 2024-08-08T21:01:06.1339075Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2024-08-08T21:01:06.1339624Z with: 2024-08-08T21:01:06.1339894Z shell: bash 2024-08-08T21:01:06.1340182Z timeout_minutes: 10 2024-08-08T21:01:06.1340502Z max_attempts: 5 2024-08-08T21:01:06.1340812Z retry_wait_seconds: 30 2024-08-08T21:01:06.1341893Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2024-08-08T21:01:06.1343040Z polling_interval_seconds: 1 2024-08-08T21:01:06.1343408Z warning_on_retry: true 2024-08-08T21:01:06.1343762Z continue_on_error: false 2024-08-08T21:01:06.1344087Z env: 2024-08-08T21:01:06.1344358Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:06.1344808Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:06.1345431Z GITHUB_TOKEN: *** 2024-08-08T21:01:06.1345747Z ##[endgroup] 2024-08-08T21:01:06.2110213Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.1 2024-08-08T21:01:06.4473525Z Defaulting to user installation because normal site-packages is not writeable 2024-08-08T21:01:06.4649097Z Requirement already satisfied: requests==2.27.1 in /home/ec2-user/.local/lib/python3.9/site-packages (2.27.1) 2024-08-08T21:01:06.4652962Z Requirement already satisfied: pyyaml==6.0.1 in /home/ec2-user/.local/lib/python3.9/site-packages (6.0.1) 2024-08-08T21:01:06.4770226Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (2.10) 2024-08-08T21:01:06.4773896Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.9/site-packages (from requests==2.27.1) (2024.7.4) 2024-08-08T21:01:06.4782312Z Requirement already satisfied: charset-normalizer~=2.0.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from requests==2.27.1) (2.0.12) 2024-08-08T21:01:06.4787291Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (1.25.10) 2024-08-08T21:01:07.1901100Z Command completed after 1 attempt(s). 2024-08-08T21:01:07.1962148Z ##[group]Run set -x 2024-08-08T21:01:07.1962484Z set -x 2024-08-08T21:01:07.1962780Z  2024-08-08T21:01:07.1963315Z # Use relative path here as this could be checked out anywhere, not necessarily 2024-08-08T21:01:07.1963971Z # in runner workspace 2024-08-08T21:01:07.1964487Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2024-08-08T21:01:07.1974309Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:07.1974796Z env: 2024-08-08T21:01:07.1975073Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:07.1975523Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:07.1975992Z ##[endgroup] 2024-08-08T21:01:07.2010153Z + python3 /home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2024-08-08T21:01:07.2252676Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2024-08-08T21:01:07.2253206Z echo "Workflow: ${GITHUB_WORKFLOW}" 2024-08-08T21:01:07.2253711Z echo "Job name: ${JOB_NAME}" 2024-08-08T21:01:07.2254102Z  2024-08-08T21:01:07.2254633Z # Use relative path here as this could be checked out anywhere, not necessarily 2024-08-08T21:01:07.2255289Z # in runner workspace 2024-08-08T21:01:07.2255854Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2024-08-08T21:01:07.2256478Z  --workflow "${GITHUB_WORKFLOW}" \ 2024-08-08T21:01:07.2256927Z  --job-name "${JOB_NAME}" \ 2024-08-08T21:01:07.2259307Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]}" \ 2024-08-08T21:01:07.2261868Z  --selected-test-configs "" \ 2024-08-08T21:01:07.2262319Z  --pr-number "${PR_NUMBER}" \ 2024-08-08T21:01:07.2262730Z  --tag "${TAG}" \ 2024-08-08T21:01:07.2263110Z  --event-name "${EVENT_NAME}" \ 2024-08-08T21:01:07.2263552Z  --schedule "${SCHEDULE}" \ 2024-08-08T21:01:07.2264017Z  --branch "${HEAD_BRANCH}" 2024-08-08T21:01:07.2272896Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:07.2273383Z env: 2024-08-08T21:01:07.2273659Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:07.2274110Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:07.2274767Z GITHUB_TOKEN: *** 2024-08-08T21:01:07.2275470Z JOB_NAME: linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:07.2276242Z PR_NUMBER: 2024-08-08T21:01:07.2276540Z TAG: ciflow/trunk/132710 2024-08-08T21:01:07.2276886Z EVENT_NAME: push 2024-08-08T21:01:07.2277192Z SCHEDULE: 2024-08-08T21:01:07.2277472Z HEAD_BRANCH: 2024-08-08T21:01:07.2277760Z ##[endgroup] 2024-08-08T21:01:07.2306899Z Workflow: trunk 2024-08-08T21:01:07.2307898Z Job name: linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:07.4588958Z INFO:root:Found no test-config label on the PR, so all test configs are included 2024-08-08T21:01:07.6206489Z ##[group]Run echo "Filtered matrix:" 2024-08-08T21:01:07.6206931Z echo "Filtered matrix:" 2024-08-08T21:01:07.6209247Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 2, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 3, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 4, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}, {"config": "default", "shard": 5, "num_shards": 5, "runner": "amz2023.linux.g5.4xlarge.nvidia.gpu"}]}" 2024-08-08T21:01:07.6211540Z  2024-08-08T21:01:07.6211805Z echo 2024-08-08T21:01:07.6212176Z echo "Is the current job unstable? False" 2024-08-08T21:01:07.6212624Z  2024-08-08T21:01:07.6212886Z echo 2024-08-08T21:01:07.6213288Z echo "Is keep-going label set? False" 2024-08-08T21:01:07.6213725Z  2024-08-08T21:01:07.6213987Z echo 2024-08-08T21:01:07.6214298Z echo "Renabled issues? " 2024-08-08T21:01:07.6223735Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:07.6224226Z env: 2024-08-08T21:01:07.6224502Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:07.6224959Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:07.6225431Z ##[endgroup] 2024-08-08T21:01:07.6255681Z Filtered matrix: 2024-08-08T21:01:07.6258589Z {include: [{config: default, shard: 1, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 2, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 3, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 4, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}, {config: default, shard: 5, num_shards: 5, runner: amz2023.linux.g5.4xlarge.nvidia.gpu}]} 2024-08-08T21:01:07.6261114Z 2024-08-08T21:01:07.6261318Z Is the current job unstable? False 2024-08-08T21:01:07.6261901Z 2024-08-08T21:01:07.6262312Z Is keep-going label set? False 2024-08-08T21:01:07.6262574Z 2024-08-08T21:01:07.6262718Z Renabled issues? 2024-08-08T21:01:07.6305097Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2024-08-08T21:01:07.6305977Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2024-08-08T21:01:07.6315542Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T21:01:07.6316042Z env: 2024-08-08T21:01:07.6316321Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:07.6316760Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:07.6317238Z JOB_TIMEOUT: 240 2024-08-08T21:01:07.6317542Z ##[endgroup] 2024-08-08T21:01:07.6398385Z ##[group]Run set -x 2024-08-08T21:01:07.6398780Z set -x 2024-08-08T21:01:07.6399073Z  2024-08-08T21:01:07.6399426Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2024-08-08T21:01:07.6399960Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2024-08-08T21:01:07.6400515Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2024-08-08T21:01:07.6401022Z  TEST_COMMAND=.ci/onnx/test.sh 2024-08-08T21:01:07.6401426Z else 2024-08-08T21:01:07.6401766Z  TEST_COMMAND=.ci/pytorch/test.sh 2024-08-08T21:01:07.6402195Z fi 2024-08-08T21:01:07.6402463Z  2024-08-08T21:01:07.6402926Z # detached container should get cleaned up by teardown_ec2_linux 2024-08-08T21:01:07.6403677Z # TODO: Stop building test binaries as part of the build phase 2024-08-08T21:01:07.6404337Z # Used for GPU_FLAG since that doesn't play nice 2024-08-08T21:01:07.6404923Z # shellcheck disable=SC2086,SC2090 2024-08-08T21:01:07.6405381Z container_name=$(docker run \ 2024-08-08T21:01:07.6405796Z  ${GPU_FLAG:-} \ 2024-08-08T21:01:07.6406167Z  -e BUILD_ENVIRONMENT \ 2024-08-08T21:01:07.6406558Z  -e PR_NUMBER \ 2024-08-08T21:01:07.6406918Z  -e GITHUB_ACTIONS \ 2024-08-08T21:01:07.6407307Z  -e GITHUB_REPOSITORY \ 2024-08-08T21:01:07.6407704Z  -e GITHUB_WORKFLOW \ 2024-08-08T21:01:07.6408089Z  -e GITHUB_JOB \ 2024-08-08T21:01:07.6408449Z  -e GITHUB_RUN_ID \ 2024-08-08T21:01:07.6408837Z  -e GITHUB_RUN_NUMBER \ 2024-08-08T21:01:07.6409232Z  -e GITHUB_RUN_ATTEMPT \ 2024-08-08T21:01:07.6409627Z  -e JOB_ID \ 2024-08-08T21:01:07.6409966Z  -e JOB_NAME \ 2024-08-08T21:01:07.6410306Z  -e BASE_SHA \ 2024-08-08T21:01:07.6410650Z  -e BRANCH \ 2024-08-08T21:01:07.6410985Z  -e SHA1 \ 2024-08-08T21:01:07.6411322Z  -e AWS_DEFAULT_REGION \ 2024-08-08T21:01:07.6411724Z  -e IN_WHEEL_TEST \ 2024-08-08T21:01:07.6412101Z  -e SHARD_NUMBER \ 2024-08-08T21:01:07.6412463Z  -e TEST_CONFIG \ 2024-08-08T21:01:07.6412832Z  -e NUM_TEST_SHARDS \ 2024-08-08T21:01:07.6413224Z  -e REENABLED_ISSUES \ 2024-08-08T21:01:07.6413671Z  -e CONTINUE_THROUGH_ERROR \ 2024-08-08T21:01:07.6414100Z  -e VERBOSE_TEST_LOGS \ 2024-08-08T21:01:07.6414502Z  -e TEST_SHOWLOCALS \ 2024-08-08T21:01:07.6414879Z  -e NO_TEST_TIMEOUT \ 2024-08-08T21:01:07.6415633Z  -e NO_TD \ 2024-08-08T21:01:07.6415978Z  -e TD_DISTRIBUTED \ 2024-08-08T21:01:07.6416351Z  -e PR_LABELS \ 2024-08-08T21:01:07.6416752Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2024-08-08T21:01:07.6417207Z  -e SCCACHE_BUCKET \ 2024-08-08T21:01:07.6417591Z  -e SCCACHE_S3_KEY_PREFIX \ 2024-08-08T21:01:07.6417994Z  -e XLA_CUDA \ 2024-08-08T21:01:07.6418397Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2024-08-08T21:01:07.6418897Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2024-08-08T21:01:07.6419414Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2024-08-08T21:01:07.6419924Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2024-08-08T21:01:07.6420379Z  -e HUGGING_FACE_HUB_TOKEN \ 2024-08-08T21:01:07.6420799Z  -e DASHBOARD_TAG \ 2024-08-08T21:01:07.6421275Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2024-08-08T21:01:07.6421814Z  --security-opt seccomp=unconfined \ 2024-08-08T21:01:07.6422459Z  --cap-add=SYS_PTRACE \ 2024-08-08T21:01:07.6422850Z  --ipc=host \ 2024-08-08T21:01:07.6423204Z  --shm-size="${SHM_SIZE}" \ 2024-08-08T21:01:07.6423589Z  --tty \ 2024-08-08T21:01:07.6423900Z  --detach \ 2024-08-08T21:01:07.6424259Z  --name="${container_name}" \ 2024-08-08T21:01:07.6424812Z  --user jenkins \ 2024-08-08T21:01:07.6425297Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2024-08-08T21:01:07.6425848Z  -w /var/lib/jenkins/workspace \ 2024-08-08T21:01:07.6426271Z  "${DOCKER_IMAGE}" 2024-08-08T21:01:07.6426624Z ) 2024-08-08T21:01:07.6427027Z # Propagate download.pytorch.org IP to container 2024-08-08T21:01:07.6427919Z grep download.pytorch.org /etc/hosts | docker exec -i "${container_name}" sudo bash -c "/bin/cat >> /etc/hosts" 2024-08-08T21:01:07.6428865Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2024-08-08T21:01:07.6429767Z docker exec -t "${container_name}" sh -c "pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2024-08-08T21:01:07.6438242Z shell: /usr/bin/bash -e {0} 2024-08-08T21:01:07.6438589Z env: 2024-08-08T21:01:07.6438864Z GIT_DEFAULT_BRANCH: main 2024-08-08T21:01:07.6439308Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:07.6439914Z BUILD_ENVIRONMENT: linux-focal-cuda12.4-py3.10-gcc9-sm86 2024-08-08T21:01:07.6440410Z PR_NUMBER: 2024-08-08T21:01:07.6440731Z GITHUB_REPOSITORY: pytorch/pytorch 2024-08-08T21:01:07.6441134Z GITHUB_WORKFLOW: trunk 2024-08-08T21:01:07.6441470Z GITHUB_JOB: test 2024-08-08T21:01:07.6441783Z GITHUB_RUN_ID: 10308807531 2024-08-08T21:01:07.6442135Z GITHUB_RUN_NUMBER: 90547 2024-08-08T21:01:07.6442487Z GITHUB_RUN_ATTEMPT: 1 2024-08-08T21:01:07.6442817Z JOB_ID: 28538801350 2024-08-08T21:01:07.6443514Z JOB_NAME: linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:07.6444330Z BRANCH: 2024-08-08T21:01:07.6444661Z SHA1: b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:07.6445165Z BASE_SHA: b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:07.6445620Z TEST_CONFIG: default 2024-08-08T21:01:07.6445946Z SHARD_NUMBER: 2 2024-08-08T21:01:07.6446246Z NUM_TEST_SHARDS: 5 2024-08-08T21:01:07.6446568Z REENABLED_ISSUES: 2024-08-08T21:01:07.6446899Z CONTINUE_THROUGH_ERROR: False 2024-08-08T21:01:07.6447275Z VERBOSE_TEST_LOGS: False 2024-08-08T21:01:07.6447629Z TEST_SHOWLOCALS: False 2024-08-08T21:01:07.6447972Z NO_TEST_TIMEOUT: False 2024-08-08T21:01:07.6448294Z NO_TD: False 2024-08-08T21:01:07.6448597Z TD_DISTRIBUTED: False 2024-08-08T21:01:07.6449008Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2024-08-08T21:01:07.6449482Z SCCACHE_S3_KEY_PREFIX: trunk 2024-08-08T21:01:07.6449839Z SHM_SIZE: 2g 2024-08-08T21:01:07.6450777Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T21:01:07.6451796Z XLA_CUDA: 2024-08-08T21:01:07.6452270Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2024-08-08T21:01:07.6452873Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2024-08-08T21:01:07.6453295Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2024-08-08T21:01:07.6453703Z DASHBOARD_TAG: 2024-08-08T21:01:07.6454018Z HUGGING_FACE_HUB_TOKEN: 2024-08-08T21:01:07.6454350Z ##[endgroup] 2024-08-08T21:01:07.6483408Z + [[ default == \m\u\l\t\i\g\p\u ]] 2024-08-08T21:01:07.6484195Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *onnx* ]] 2024-08-08T21:01:07.6484712Z + TEST_COMMAND=.ci/pytorch/test.sh 2024-08-08T21:01:07.6492868Z +++ nproc --ignore=2 2024-08-08T21:01:07.6512172Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=14 -e SCCACHE_BUCKET -e SCCACHE_S3_KEY_PREFIX -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/tmp/github_env_10308807531 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T21:01:19.1424547Z + container_name=44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T21:01:19.1430014Z + grep download.pytorch.org /etc/hosts 2024-08-08T21:01:19.1432322Z + docker exec -i 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 sudo bash -c '/bin/cat >> /etc/hosts' 2024-08-08T21:01:19.2776338Z + echo DOCKER_CONTAINER_ID=44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T21:01:19.2780072Z ++ echo dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl 2024-08-08T21:01:19.2782947Z + docker exec -t 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 sh -c 'pip install dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2024-08-08T21:01:19.6934981Z Processing ./dist/torch-2.5.0a0+gitb9d86fa-cp310-cp310-linux_x86_64.whl (from torch==2.5.0a0+gitb9d86fa) 2024-08-08T21:01:20.0161638Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (3.13.1) 2024-08-08T21:01:20.0165372Z Requirement already satisfied: typing-extensions>=4.8.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (4.12.2) 2024-08-08T21:01:20.0167920Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (2.8.8) 2024-08-08T21:01:20.0171418Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (3.1.4) 2024-08-08T21:01:20.0174592Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (2024.6.1) 2024-08-08T21:01:20.0184627Z Requirement already satisfied: sympy>=1.13.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (1.13.1) 2024-08-08T21:01:20.0214793Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (3.3.0) 2024-08-08T21:01:20.0272812Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (1.21.2) 2024-08-08T21:01:20.0306230Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.0->torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (1.3.0) 2024-08-08T21:01:20.1197065Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.5.0a0+gitb9d86fa->torch==2.5.0a0+gitb9d86fa) (2.1.5) 2024-08-08T21:01:20.9359014Z Installing collected packages: torch 2024-08-08T21:01:31.5897430Z Successfully installed torch-2.5.0a0+gitb9d86fa 2024-08-08T21:01:31.6677850Z ++ dirname .ci/pytorch/test.sh 2024-08-08T21:01:31.6686879Z + source .ci/pytorch/common.sh 2024-08-08T21:01:31.6690558Z +++ dirname .ci/pytorch/common.sh 2024-08-08T21:01:31.6699716Z ++ source .ci/pytorch/common_utils.sh 2024-08-08T21:01:31.6701821Z +++ declare -f -t trap_add 2024-08-08T21:01:31.6707062Z ++ set -ex 2024-08-08T21:01:31.6707729Z ++ [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *rocm* ]] 2024-08-08T21:01:31.6708532Z ++ BUILD_TEST_LIBTORCH=0 2024-08-08T21:01:31.6711307Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-08T21:01:31.6712050Z ++ stat -c %u /var/lib/jenkins/workspace 2024-08-08T21:01:31.6730341Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2024-08-08T21:01:31.6730916Z + trap_add cleanup_workspace EXIT 2024-08-08T21:01:31.6731350Z + trap_add_cmd=cleanup_workspace 2024-08-08T21:01:31.6731710Z + shift 2024-08-08T21:01:31.6731990Z + for trap_add_name in "$@" 2024-08-08T21:01:31.6738593Z +++ trap -p EXIT 2024-08-08T21:01:31.6742097Z ++ eval 'extract_trap_cmd ' 2024-08-08T21:01:31.6742553Z +++ extract_trap_cmd 2024-08-08T21:01:31.6742939Z +++ printf '%s\n' '' 2024-08-08T21:01:31.6743322Z ++ printf '%s\n' cleanup_workspace 2024-08-08T21:01:31.6745544Z + trap -- ' 2024-08-08T21:01:31.6745904Z cleanup_workspace' EXIT 2024-08-08T21:01:31.6746359Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2024-08-08T21:01:32.3733257Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2024-08-08T21:01:32.3755196Z + echo 'Environment variables:' 2024-08-08T21:01:32.3755608Z Environment variables: 2024-08-08T21:01:32.3755923Z + env 2024-08-08T21:01:32.3765777Z INSTALLED_DB=yes 2024-08-08T21:01:32.3766336Z NV_LIBCUBLAS_VERSION=12.4.2.65-1 2024-08-08T21:01:32.3766892Z NVIDIA_VISIBLE_DEVICES=all 2024-08-08T21:01:32.3767413Z NV_NVML_DEV_VERSION=12.4.99-1 2024-08-08T21:01:32.3768087Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-08T21:01:32.3768776Z CONTINUE_THROUGH_ERROR=False 2024-08-08T21:01:32.3769292Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.20.5-1+cuda12.4 2024-08-08T21:01:32.3769830Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.20.5-1 2024-08-08T21:01:32.3770403Z BUILD_ENVIRONMENT=linux-focal-cuda12.4-py3.10-gcc9-sm86 2024-08-08T21:01:32.3770915Z HOSTNAME=44a07f2f048d 2024-08-08T21:01:32.3771775Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.3773875Z GITHUB_ACTION=__self 2024-08-08T21:01:32.3774796Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2024-08-08T21:01:32.3780711Z NVIDIA_REQUIRE_CUDA=cuda>=12.4 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=tesla,driver>=525,driver<526 brand=unknown,driver>=525,driver<526 brand=nvidia,driver>=525,driver<526 brand=nvidiartx,driver>=525,driver<526 brand=geforce,driver>=525,driver<526 brand=geforcertx,driver>=525,driver<526 brand=quadro,driver>=525,driver<526 brand=quadrortx,driver>=525,driver<526 brand=titan,driver>=525,driver<526 brand=titanrtx,driver>=525,driver<526 brand=tesla,driver>=535,driver<536 brand=unknown,driver>=535,driver<536 brand=nvidia,driver>=535,driver<536 brand=nvidiartx,driver>=535,driver<536 brand=geforce,driver>=535,driver<536 brand=geforcertx,driver>=535,driver<536 brand=quadro,driver>=535,driver<536 brand=quadrortx,driver>=535,driver<536 brand=titan,driver>=535,driver<536 brand=titanrtx,driver>=535,driver<536 2024-08-08T21:01:32.3787002Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-12-4=12.4.2.65-1 2024-08-08T21:01:32.3787513Z NV_NVTX_VERSION=12.4.99-1 2024-08-08T21:01:32.3787858Z GITHUB_RUN_NUMBER=90547 2024-08-08T21:01:32.3788178Z TEST_CONFIG=default 2024-08-08T21:01:32.3788512Z GITHUB_REPOSITORY_OWNER_ID=21003710 2024-08-08T21:01:32.3788987Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2024-08-08T21:01:32.3789436Z NV_CUDA_CUDART_DEV_VERSION=12.4.99-1 2024-08-08T21:01:32.3790146Z NV_LIBCUSPARSE_VERSION=12.3.0.142-1 2024-08-08T21:01:32.3790636Z NV_LIBNPP_VERSION=12.2.5.2-1 2024-08-08T21:01:32.3791098Z GITHUB_TRIGGERING_ACTOR=pytorch-bot[bot] 2024-08-08T21:01:32.3791605Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2024-08-08T21:01:32.3792191Z GITHUB_REF_TYPE=tag 2024-08-08T21:01:32.3792516Z TORCH_CUDA_ARCH_LIST=Maxwell 2024-08-08T21:01:32.3793058Z NCCL_VERSION=2.20.5-1 2024-08-08T21:01:32.3793448Z BASE_SHA=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.3793880Z XLA_CUDA= 2024-08-08T21:01:32.3794151Z HUGGING_FACE_HUB_TOKEN= 2024-08-08T21:01:32.3796287Z *** 2024-08-08T21:01:32.3796578Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2024-08-08T21:01:32.3796964Z GITHUB_REPOSITORY_ID=65600975 2024-08-08T21:01:32.3797320Z GITHUB_ACTIONS=true 2024-08-08T21:01:32.3797646Z NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:32.3798151Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-12-4=12.4.99-1 2024-08-08T21:01:32.3798683Z NV_LIBNPP_PACKAGE=libnpp-12-4=12.2.5.2-1 2024-08-08T21:01:32.3799147Z SHA1=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.3799641Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2024-08-08T21:01:32.3800112Z GITHUB_SHA=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.3800848Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk.yml@refs/tags/ciflow/trunk/132710 2024-08-08T21:01:32.3801498Z UCC_HOME=/usr 2024-08-08T21:01:32.3801844Z NV_LIBCUBLAS_DEV_VERSION=12.4.2.65-1 2024-08-08T21:01:32.3802244Z VERBOSE_TEST_LOGS=False 2024-08-08T21:01:32.3802575Z NVIDIA_PRODUCT_NAME=CUDA 2024-08-08T21:01:32.3803027Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-12-4 2024-08-08T21:01:32.3803505Z GITHUB_REF=refs/tags/ciflow/trunk/132710 2024-08-08T21:01:32.3803945Z NV_CUDA_CUDART_VERSION=12.4.99-1 2024-08-08T21:01:32.3804309Z SHARD_NUMBER=2 2024-08-08T21:01:32.3804613Z GITHUB_REF_PROTECTED=false 2024-08-08T21:01:32.3804951Z HOME=/var/lib/jenkins 2024-08-08T21:01:32.3805310Z GITHUB_API_URL=https://api.github.com 2024-08-08T21:01:32.3805753Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2024-08-08T21:01:32.3806214Z UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb 2024-08-08T21:01:32.3806681Z SCCACHE_S3_KEY_PREFIX=trunk 2024-08-08T21:01:32.3807028Z CUDA_VERSION=12.4.0 2024-08-08T21:01:32.3807440Z NV_LIBCUBLAS_PACKAGE=libcublas-12-4=12.4.2.65-1 2024-08-08T21:01:32.3807870Z NUM_TEST_SHARDS=5 2024-08-08T21:01:32.3808166Z UCX_HOME=/usr 2024-08-08T21:01:32.3808674Z NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE=cuda-nsight-compute-12-4=12.4.0-1 2024-08-08T21:01:32.3809738Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.3811016Z JOB_NAME=linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:32.3812257Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.3813356Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2024-08-08T21:01:32.3814002Z GITHUB_EVENT_NAME=push 2024-08-08T21:01:32.3814323Z DASHBOARD_TAG= 2024-08-08T21:01:32.3814617Z GITHUB_RUN_ID=10308807531 2024-08-08T21:01:32.3815068Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-12-4=12.2.5.2-1 2024-08-08T21:01:32.3815988Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-12-4 2024-08-08T21:01:32.3816972Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.3817867Z GITHUB_ACTOR=pytorch-bot[bot] 2024-08-08T21:01:32.3818272Z NV_LIBNPP_DEV_VERSION=12.2.5.2-1 2024-08-08T21:01:32.3818626Z PR_NUMBER= 2024-08-08T21:01:32.3818904Z GITHUB_RUN_ATTEMPT=1 2024-08-08T21:01:32.3819233Z ANACONDA_PYTHON_VERSION=3.10 2024-08-08T21:01:32.3819658Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2024-08-08T21:01:32.3820107Z TERM=xterm 2024-08-08T21:01:32.3820437Z NV_LIBCUSPARSE_DEV_VERSION=12.3.0.142-1 2024-08-08T21:01:32.3820835Z INSTALLED_VISION=yes 2024-08-08T21:01:32.3821372Z BRANCH= 2024-08-08T21:01:32.3821652Z OPENSSL_ROOT_DIR=/opt/openssl 2024-08-08T21:01:32.3822034Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2024-08-08T21:01:32.3822444Z CUDA_PATH=/usr/local/cuda 2024-08-08T21:01:32.3823185Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2024-08-08T21:01:32.3823910Z GITHUB_SERVER_URL=https://github.com 2024-08-08T21:01:32.3824519Z UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b 2024-08-08T21:01:32.3824979Z REENABLED_ISSUES= 2024-08-08T21:01:32.3825264Z SHLVL=1 2024-08-08T21:01:32.3825519Z MAX_JOBS=14 2024-08-08T21:01:32.3825833Z NV_CUDA_LIB_VERSION=12.4.0-1 2024-08-08T21:01:32.3826169Z NVARCH=x86_64 2024-08-08T21:01:32.3826460Z GITHUB_ACTOR_ID=54816060 2024-08-08T21:01:32.3826901Z GITHUB_WORKFLOW_SHA=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.3827406Z GITHUB_REF_NAME=ciflow/trunk/132710 2024-08-08T21:01:32.3827877Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-12-4 2024-08-08T21:01:32.3828538Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2024-08-08T21:01:32.3829100Z GITHUB_JOB=test 2024-08-08T21:01:32.3829495Z NV_LIBNCCL_PACKAGE=libnccl2=2.20.5-1+cuda12.4 2024-08-08T21:01:32.3830038Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-08-08T21:01:32.3830532Z NO_TEST_TIMEOUT=False 2024-08-08T21:01:32.3830851Z TD_DISTRIBUTED=False 2024-08-08T21:01:32.3831225Z NV_CUDA_NSIGHT_COMPUTE_VERSION=12.4.0-1 2024-08-08T21:01:32.3831652Z GITHUB_REPOSITORY=pytorch/pytorch 2024-08-08T21:01:32.3832218Z NV_NVPROF_VERSION=12.4.99-1 2024-08-08T21:01:32.3832575Z GITHUB_RETENTION_DAYS=90 2024-08-08T21:01:32.3832913Z OPENSSL_DIR=/opt/openssl 2024-08-08T21:01:32.3833258Z GITHUB_ACTION_REPOSITORY= 2024-08-08T21:01:32.3834265Z PATH=/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-08T21:01:32.3835295Z GITHUB_BASE_REF= 2024-08-08T21:01:32.3835614Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2024-08-08T21:01:32.3835985Z CI=true 2024-08-08T21:01:32.3836289Z NV_LIBNCCL_PACKAGE_VERSION=2.20.5-1 2024-08-08T21:01:32.3836695Z GITHUB_REPOSITORY_OWNER=pytorch 2024-08-08T21:01:32.3837060Z JOB_ID=28538801350 2024-08-08T21:01:32.3837357Z INSTALLED_PROTOBUF=yes 2024-08-08T21:01:32.3837678Z GITHUB_HEAD_REF= 2024-08-08T21:01:32.3837972Z GITHUB_ACTION_REF= 2024-08-08T21:01:32.3838395Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2024-08-08T21:01:32.3838859Z TEST_SHOWLOCALS=False 2024-08-08T21:01:32.3839181Z GITHUB_WORKFLOW=trunk 2024-08-08T21:01:32.3839511Z DEBIAN_FRONTEND=noninteractive 2024-08-08T21:01:32.3840408Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.3841205Z NO_TD=False 2024-08-08T21:01:32.3841497Z SKIP_SCCACHE_INITIALIZATION=1 2024-08-08T21:01:32.3841854Z _=/usr/bin/env 2024-08-08T21:01:32.3842325Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2024-08-08T21:01:32.3946851Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2024-08-08T21:01:32.3948061Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T21:01:32.3949070Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2024-08-08T21:01:32.3950002Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2024-08-08T21:01:32.3950590Z + BUILD_DIR=build 2024-08-08T21:01:32.3950931Z + BUILD_RENAMED_DIR=build_renamed 2024-08-08T21:01:32.3951355Z + BUILD_BIN_DIR=build/bin 2024-08-08T21:01:32.3951683Z + SHARD_NUMBER=2 2024-08-08T21:01:32.3952171Z + NUM_TEST_SHARDS=5 2024-08-08T21:01:32.3952566Z + export VALGRIND=ON 2024-08-08T21:01:32.3952888Z + VALGRIND=ON 2024-08-08T21:01:32.3953517Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *clang9* ]] 2024-08-08T21:01:32.3954044Z + [[ 0 == \1 ]] 2024-08-08T21:01:32.3954333Z + [[ False == \1 ]] 2024-08-08T21:01:32.3954814Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *bazel* ]] 2024-08-08T21:01:32.3955514Z ++ realpath build/custom_test_artifacts 2024-08-08T21:01:32.3965006Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2024-08-08T21:01:32.3965701Z + [[ -n '' ]] 2024-08-08T21:01:32.3966052Z + echo 'Environment variables' 2024-08-08T21:01:32.3966418Z Environment variables 2024-08-08T21:01:32.3966729Z + env 2024-08-08T21:01:32.3973666Z INSTALLED_DB=yes 2024-08-08T21:01:32.3974410Z NV_LIBCUBLAS_VERSION=12.4.2.65-1 2024-08-08T21:01:32.3974945Z NVIDIA_VISIBLE_DEVICES=all 2024-08-08T21:01:32.3975461Z NV_NVML_DEV_VERSION=12.4.99-1 2024-08-08T21:01:32.3976234Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-08T21:01:32.3976971Z CONTINUE_THROUGH_ERROR=False 2024-08-08T21:01:32.3977631Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.20.5-1+cuda12.4 2024-08-08T21:01:32.3978358Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.20.5-1 2024-08-08T21:01:32.3979096Z BUILD_ENVIRONMENT=linux-focal-cuda12.4-py3.10-gcc9-sm86 2024-08-08T21:01:32.3979745Z HOSTNAME=44a07f2f048d 2024-08-08T21:01:32.3980865Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.3981856Z GITHUB_ACTION=__self 2024-08-08T21:01:32.3982197Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2024-08-08T21:01:32.3987664Z NVIDIA_REQUIRE_CUDA=cuda>=12.4 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=tesla,driver>=525,driver<526 brand=unknown,driver>=525,driver<526 brand=nvidia,driver>=525,driver<526 brand=nvidiartx,driver>=525,driver<526 brand=geforce,driver>=525,driver<526 brand=geforcertx,driver>=525,driver<526 brand=quadro,driver>=525,driver<526 brand=quadrortx,driver>=525,driver<526 brand=titan,driver>=525,driver<526 brand=titanrtx,driver>=525,driver<526 brand=tesla,driver>=535,driver<536 brand=unknown,driver>=535,driver<536 brand=nvidia,driver>=535,driver<536 brand=nvidiartx,driver>=535,driver<536 brand=geforce,driver>=535,driver<536 brand=geforcertx,driver>=535,driver<536 brand=quadro,driver>=535,driver<536 brand=quadrortx,driver>=535,driver<536 brand=titan,driver>=535,driver<536 brand=titanrtx,driver>=535,driver<536 2024-08-08T21:01:32.3993620Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-12-4=12.4.2.65-1 2024-08-08T21:01:32.3994126Z NV_NVTX_VERSION=12.4.99-1 2024-08-08T21:01:32.3994468Z GITHUB_RUN_NUMBER=90547 2024-08-08T21:01:32.3994796Z TEST_CONFIG=default 2024-08-08T21:01:32.3995127Z GITHUB_REPOSITORY_OWNER_ID=21003710 2024-08-08T21:01:32.3995591Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2024-08-08T21:01:32.3996056Z NV_CUDA_CUDART_DEV_VERSION=12.4.99-1 2024-08-08T21:01:32.3996493Z NV_LIBCUSPARSE_VERSION=12.3.0.142-1 2024-08-08T21:01:32.3996906Z NV_LIBNPP_VERSION=12.2.5.2-1 2024-08-08T21:01:32.3997343Z GITHUB_TRIGGERING_ACTOR=pytorch-bot[bot] 2024-08-08T21:01:32.3997839Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2024-08-08T21:01:32.3998289Z GITHUB_REF_TYPE=tag 2024-08-08T21:01:32.3998609Z TORCH_CUDA_ARCH_LIST=Maxwell 2024-08-08T21:01:32.3998986Z NCCL_VERSION=2.20.5-1 2024-08-08T21:01:32.3999364Z BASE_SHA=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.3999797Z XLA_CUDA= 2024-08-08T21:01:32.4000081Z HUGGING_FACE_HUB_TOKEN= 2024-08-08T21:01:32.4000464Z *** 2024-08-08T21:01:32.4000750Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2024-08-08T21:01:32.4001143Z GITHUB_REPOSITORY_ID=65600975 2024-08-08T21:01:32.4001495Z GITHUB_ACTIONS=true 2024-08-08T21:01:32.4001824Z NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T21:01:32.4002312Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-12-4=12.4.99-1 2024-08-08T21:01:32.4002837Z NV_LIBNPP_PACKAGE=libnpp-12-4=12.2.5.2-1 2024-08-08T21:01:32.4003292Z SHA1=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.4003796Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2024-08-08T21:01:32.4004421Z GITHUB_SHA=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.4005158Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk.yml@refs/tags/ciflow/trunk/132710 2024-08-08T21:01:32.4005820Z UCC_HOME=/usr 2024-08-08T21:01:32.4006156Z NV_LIBCUBLAS_DEV_VERSION=12.4.2.65-1 2024-08-08T21:01:32.4006562Z VERBOSE_TEST_LOGS=False 2024-08-08T21:01:32.4006902Z NVIDIA_PRODUCT_NAME=CUDA 2024-08-08T21:01:32.4007445Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-12-4 2024-08-08T21:01:32.4007931Z GITHUB_REF=refs/tags/ciflow/trunk/132710 2024-08-08T21:01:32.4008385Z NV_CUDA_CUDART_VERSION=12.4.99-1 2024-08-08T21:01:32.4008745Z SHARD_NUMBER=2 2024-08-08T21:01:32.4009051Z GITHUB_REF_PROTECTED=false 2024-08-08T21:01:32.4009396Z HOME=/var/lib/jenkins 2024-08-08T21:01:32.4009745Z GITHUB_API_URL=https://api.github.com 2024-08-08T21:01:32.4010180Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2024-08-08T21:01:32.4010646Z UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb 2024-08-08T21:01:32.4011112Z SCCACHE_S3_KEY_PREFIX=trunk 2024-08-08T21:01:32.4011467Z CUDA_VERSION=12.4.0 2024-08-08T21:01:32.4011890Z NV_LIBCUBLAS_PACKAGE=libcublas-12-4=12.4.2.65-1 2024-08-08T21:01:32.4012315Z NUM_TEST_SHARDS=5 2024-08-08T21:01:32.4012613Z UCX_HOME=/usr 2024-08-08T21:01:32.4013120Z NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE=cuda-nsight-compute-12-4=12.4.0-1 2024-08-08T21:01:32.4014188Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.4015728Z JOB_NAME=linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, amz2023.linux.g5.4xlarge.nvidia.gpu) 2024-08-08T21:01:32.4016998Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.4018091Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2024-08-08T21:01:32.4018721Z GITHUB_EVENT_NAME=push 2024-08-08T21:01:32.4019043Z DASHBOARD_TAG= 2024-08-08T21:01:32.4019339Z GITHUB_RUN_ID=10308807531 2024-08-08T21:01:32.4019798Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-12-4=12.2.5.2-1 2024-08-08T21:01:32.4020327Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-12-4 2024-08-08T21:01:32.4021307Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.4022189Z GITHUB_ACTOR=pytorch-bot[bot] 2024-08-08T21:01:32.4022588Z NV_LIBNPP_DEV_VERSION=12.2.5.2-1 2024-08-08T21:01:32.4022962Z PR_NUMBER= 2024-08-08T21:01:32.4023236Z GITHUB_RUN_ATTEMPT=1 2024-08-08T21:01:32.4023544Z VALGRIND=ON 2024-08-08T21:01:32.4023837Z ANACONDA_PYTHON_VERSION=3.10 2024-08-08T21:01:32.4024263Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2024-08-08T21:01:32.4024712Z TERM=xterm 2024-08-08T21:01:32.4025042Z NV_LIBCUSPARSE_DEV_VERSION=12.3.0.142-1 2024-08-08T21:01:32.4025443Z INSTALLED_VISION=yes 2024-08-08T21:01:32.4025749Z BRANCH= 2024-08-08T21:01:32.4026027Z OPENSSL_ROOT_DIR=/opt/openssl 2024-08-08T21:01:32.4026413Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2024-08-08T21:01:32.4026830Z CUDA_PATH=/usr/local/cuda 2024-08-08T21:01:32.4027570Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2024-08-08T21:01:32.4028294Z GITHUB_SERVER_URL=https://github.com 2024-08-08T21:01:32.4028769Z UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b 2024-08-08T21:01:32.4029228Z REENABLED_ISSUES= 2024-08-08T21:01:32.4029514Z SHLVL=1 2024-08-08T21:01:32.4029779Z MAX_JOBS=14 2024-08-08T21:01:32.4030086Z NV_CUDA_LIB_VERSION=12.4.0-1 2024-08-08T21:01:32.4030424Z NVARCH=x86_64 2024-08-08T21:01:32.4030733Z GITHUB_ACTOR_ID=54816060 2024-08-08T21:01:32.4031178Z GITHUB_WORKFLOW_SHA=b9d86fa89636e301796d4201f36d86c73f6e49bc 2024-08-08T21:01:32.4031685Z GITHUB_REF_NAME=ciflow/trunk/132710 2024-08-08T21:01:32.4032276Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-12-4 2024-08-08T21:01:32.4032928Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2024-08-08T21:01:32.4033482Z GITHUB_JOB=test 2024-08-08T21:01:32.4033873Z NV_LIBNCCL_PACKAGE=libnccl2=2.20.5-1+cuda12.4 2024-08-08T21:01:32.4034574Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-08-08T21:01:32.4035074Z NO_TEST_TIMEOUT=False 2024-08-08T21:01:32.4035388Z TD_DISTRIBUTED=False 2024-08-08T21:01:32.4035762Z NV_CUDA_NSIGHT_COMPUTE_VERSION=12.4.0-1 2024-08-08T21:01:32.4036188Z GITHUB_REPOSITORY=pytorch/pytorch 2024-08-08T21:01:32.4036591Z NV_NVPROF_VERSION=12.4.99-1 2024-08-08T21:01:32.4037068Z GITHUB_RETENTION_DAYS=90 2024-08-08T21:01:32.4037419Z OPENSSL_DIR=/opt/openssl 2024-08-08T21:01:32.4037754Z GITHUB_ACTION_REPOSITORY= 2024-08-08T21:01:32.4038770Z PATH=/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-08T21:01:32.4039802Z GITHUB_BASE_REF= 2024-08-08T21:01:32.4040114Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2024-08-08T21:01:32.4040474Z CI=true 2024-08-08T21:01:32.4040787Z NV_LIBNCCL_PACKAGE_VERSION=2.20.5-1 2024-08-08T21:01:32.4041186Z GITHUB_REPOSITORY_OWNER=pytorch 2024-08-08T21:01:32.4041561Z JOB_ID=28538801350 2024-08-08T21:01:32.4041866Z INSTALLED_PROTOBUF=yes 2024-08-08T21:01:32.4042183Z GITHUB_HEAD_REF= 2024-08-08T21:01:32.4042482Z GITHUB_ACTION_REF= 2024-08-08T21:01:32.4042911Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2024-08-08T21:01:32.4043359Z TEST_SHOWLOCALS=False 2024-08-08T21:01:32.4043681Z GITHUB_WORKFLOW=trunk 2024-08-08T21:01:32.4044021Z DEBIAN_FRONTEND=noninteractive 2024-08-08T21:01:32.4044910Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_3e8b4a40-6de9-4b3d-aa43-0eca2c83df8d 2024-08-08T21:01:32.4045701Z NO_TD=False 2024-08-08T21:01:32.4045998Z SKIP_SCCACHE_INITIALIZATION=1 2024-08-08T21:01:32.4046348Z _=/usr/bin/env 2024-08-08T21:01:32.4046682Z + echo 'Testing pytorch' 2024-08-08T21:01:32.4047020Z Testing pytorch 2024-08-08T21:01:32.4047332Z + export LANG=C.UTF-8 2024-08-08T21:01:32.4047664Z + LANG=C.UTF-8 2024-08-08T21:01:32.4047941Z + PR_NUMBER= 2024-08-08T21:01:32.4048248Z + [[ default == \d\e\f\a\u\l\t ]] 2024-08-08T21:01:32.4048643Z + export CUDA_VISIBLE_DEVICES=0 2024-08-08T21:01:32.4049014Z + CUDA_VISIBLE_DEVICES=0 2024-08-08T21:01:32.4049364Z + export HIP_VISIBLE_DEVICES=0 2024-08-08T21:01:32.4049735Z + HIP_VISIBLE_DEVICES=0 2024-08-08T21:01:32.4050092Z + [[ default == \d\i\s\t\r\i\b\u\t\e\d ]] 2024-08-08T21:01:32.4050509Z + [[ default == \s\l\o\w ]] 2024-08-08T21:01:32.4051082Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *slow-gradcheck* ]] 2024-08-08T21:01:32.4051761Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *cuda* ]] 2024-08-08T21:01:32.4052303Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2024-08-08T21:01:32.4052770Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2024-08-08T21:01:32.4053192Z + [[ default == *crossref* ]] 2024-08-08T21:01:32.4053702Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *rocm* ]] 2024-08-08T21:01:32.4054332Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *xpu* ]] 2024-08-08T21:01:32.4054974Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *-bazel-* ]] 2024-08-08T21:01:32.4055529Z + pip_install --user ninja==1.10.2 2024-08-08T21:01:32.4056053Z + pip install --progress-bar off --user ninja==1.10.2 2024-08-08T21:01:33.7426891Z Collecting ninja==1.10.2 2024-08-08T21:01:33.7595463Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2024-08-08T21:01:33.7964422Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2024-08-08T21:01:34.6786840Z Installing collected packages: ninja 2024-08-08T21:01:34.6870003Z  WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH. 2024-08-08T21:01:34.6871265Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2024-08-08T21:01:34.6928181Z Successfully installed ninja-1.10.2 2024-08-08T21:01:34.7693065Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-08T21:01:34.7695606Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-08T21:01:34.7697159Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *aarch64* ]] 2024-08-08T21:01:34.7697660Z + install_tlparse 2024-08-08T21:01:34.7698222Z + pip_install --user tlparse==0.3.7 2024-08-08T21:01:34.7698782Z + pip install --progress-bar off --user tlparse==0.3.7 2024-08-08T21:01:35.1725422Z Collecting tlparse==0.3.7 2024-08-08T21:01:35.1875542Z Downloading tlparse-0.3.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (346 bytes) 2024-08-08T21:01:35.1972863Z Downloading tlparse-0.3.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.2 MB) 2024-08-08T21:01:36.0823021Z Installing collected packages: tlparse 2024-08-08T21:01:36.1229650Z Successfully installed tlparse-0.3.7 2024-08-08T21:01:36.1988519Z ++ python -m site --user-base 2024-08-08T21:01:36.2173881Z + PATH=/var/lib/jenkins/.local/bin:/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2024-08-08T21:01:36.2177271Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *asan* ]] 2024-08-08T21:01:36.2178596Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *-debug* ]] 2024-08-08T21:01:36.2180077Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *-bazel-* ]] 2024-08-08T21:01:36.2181959Z + echo 'We are not in debug mode: linux-focal-cuda12.4-py3.10-gcc9-sm86. Expect the assertion to pass' 2024-08-08T21:01:36.2183177Z We are not in debug mode: linux-focal-cuda12.4-py3.10-gcc9-sm86. Expect the assertion to pass 2024-08-08T21:01:36.2183864Z + cd test 2024-08-08T21:01:36.2184393Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2024-08-08T21:01:37.8802346Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2024-08-08T21:01:37.8802909Z + [[ default == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2024-08-08T21:01:37.8806120Z + DYNAMO_BENCHMARK_FLAGS=() 2024-08-08T21:01:37.8806591Z + [[ default == *dynamo_eager* ]] 2024-08-08T21:01:37.8807107Z + [[ default == *aot_eager* ]] 2024-08-08T21:01:37.8807553Z + [[ default == *aot_inductor* ]] 2024-08-08T21:01:37.8807958Z + [[ default == *inductor* ]] 2024-08-08T21:01:37.8808341Z + [[ default == *dynamic* ]] 2024-08-08T21:01:37.8808696Z + [[ default == *cpu* ]] 2024-08-08T21:01:37.8809311Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2024-08-08T21:01:37.8839075Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *libtorch* ]] 2024-08-08T21:01:37.8839923Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *-bazel-* ]] 2024-08-08T21:01:37.8842167Z + cd test 2024-08-08T21:01:37.8842819Z + python -c 'import torch; print(torch.__config__.show())' 2024-08-08T21:01:39.3738754Z PyTorch built with: 2024-08-08T21:01:39.3739515Z - GCC 9.4 2024-08-08T21:01:39.3739963Z - C++ Version: 201703 2024-08-08T21:01:39.3741056Z - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications 2024-08-08T21:01:39.3742131Z - Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67) 2024-08-08T21:01:39.3742788Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2024-08-08T21:01:39.3743303Z - LAPACK is enabled (usually provided by MKL) 2024-08-08T21:01:39.3743801Z - NNPACK is enabled 2024-08-08T21:01:39.3744179Z - CPU capability usage: AVX2 2024-08-08T21:01:39.3744584Z - CUDA Runtime 12.4 2024-08-08T21:01:39.3745101Z - NVCC architecture flags: -gencode;arch=compute_86,code=sm_86 2024-08-08T21:01:39.3745640Z - CuDNN 90.1 2024-08-08T21:01:39.3745966Z - Magma 2.6.1 2024-08-08T21:01:39.3753158Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.4, CUDNN_VERSION=9.1.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Werror -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.5.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 2024-08-08T21:01:39.3760680Z 2024-08-08T21:01:39.6964660Z + cd test 2024-08-08T21:01:39.6965464Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2024-08-08T21:01:41.0774434Z ATen/Parallel: 2024-08-08T21:01:41.0774991Z at::get_num_threads() : 8 2024-08-08T21:01:41.0775508Z at::get_num_interop_threads() : 16 2024-08-08T21:01:41.0776052Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2024-08-08T21:01:41.0776560Z omp_get_max_threads() : 8 2024-08-08T21:01:41.0777795Z Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications 2024-08-08T21:01:41.0778588Z mkl_get_max_threads() : 8 2024-08-08T21:01:41.0779224Z Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67) 2024-08-08T21:01:41.0779857Z std::thread::hardware_concurrency() : 16 2024-08-08T21:01:41.0780279Z Environment variables: 2024-08-08T21:01:41.0780630Z OMP_NUM_THREADS : [not set] 2024-08-08T21:01:41.0781005Z MKL_NUM_THREADS : [not set] 2024-08-08T21:01:41.0781375Z ATen parallel backend: OpenMP 2024-08-08T21:01:41.0781642Z 2024-08-08T21:01:41.3722170Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *aarch64* ]] 2024-08-08T21:01:41.3722928Z + [[ default == *backward* ]] 2024-08-08T21:01:41.3723408Z + [[ default == *xla* ]] 2024-08-08T21:01:41.3723859Z + [[ default == *executorch* ]] 2024-08-08T21:01:41.3724253Z + [[ default == \j\i\t\_\l\e\g\a\c\y ]] 2024-08-08T21:01:41.3724878Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *libtorch* ]] 2024-08-08T21:01:41.3725395Z + [[ default == distributed ]] 2024-08-08T21:01:41.3725787Z + [[ default == *inductor_distributed* ]] 2024-08-08T21:01:41.3726271Z + [[ default == *inductor-halide* ]] 2024-08-08T21:01:41.3726772Z + [[ default == *inductor-micro-benchmark* ]] 2024-08-08T21:01:41.3727216Z + [[ default == *huggingface* ]] 2024-08-08T21:01:41.3727594Z + [[ default == *timm* ]] 2024-08-08T21:01:41.3727942Z + [[ default == *torchbench* ]] 2024-08-08T21:01:41.3728405Z + [[ default == *inductor_cpp_wrapper_abi_compatible* ]] 2024-08-08T21:01:41.3728964Z + [[ default == *inductor* ]] 2024-08-08T21:01:41.3729349Z + [[ default == *dynamo* ]] 2024-08-08T21:01:41.3729862Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *rocm* ]] 2024-08-08T21:01:41.3730348Z + [[ 2 == 1 ]] 2024-08-08T21:01:41.3730634Z + [[ 2 == 2 ]] 2024-08-08T21:01:41.3730958Z + [[ 5 -gt 1 ]] 2024-08-08T21:01:41.3731266Z + install_torchvision 2024-08-08T21:01:41.3731589Z + local orig_preload 2024-08-08T21:01:41.3731906Z + local commit 2024-08-08T21:01:41.3732214Z ++ get_pinned_commit vision 2024-08-08T21:01:41.3732603Z ++ cat .github/ci_commit_pins/vision.txt 2024-08-08T21:01:41.3746703Z + commit=d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:41.3747238Z + orig_preload= 2024-08-08T21:01:41.3747587Z + '[' -n '' ']' 2024-08-08T21:01:41.3748408Z + pip_install --no-use-pep517 --user git+https://github.com/pytorch/vision.git@d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:41.3749825Z + pip install --progress-bar off --no-use-pep517 --user git+https://github.com/pytorch/vision.git@d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:41.7261836Z Collecting git+https://github.com/pytorch/vision.git@d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:41.7267036Z Cloning https://github.com/pytorch/vision.git (to revision d23a6e1664d20707c11781299611436e1f0c104f) to /tmp/pip-req-build-8vql0xn4 2024-08-08T21:01:41.7298440Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-8vql0xn4 2024-08-08T21:01:43.2176190Z Running command git rev-parse -q --verify 'sha^d23a6e1664d20707c11781299611436e1f0c104f' 2024-08-08T21:01:43.2205357Z Running command git fetch -q https://github.com/pytorch/vision.git d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:44.5653395Z Running command git checkout -q d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:44.8625418Z Resolved https://github.com/pytorch/vision.git to commit d23a6e1664d20707c11781299611436e1f0c104f 2024-08-08T21:01:47.3537599Z Preparing metadata (setup.py) ... [?25l- \ done 2024-08-08T21:01:47.3583394Z [?25hRequirement already satisfied: numpy in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.19.0a0+d23a6e1) (1.21.2) 2024-08-08T21:01:47.3586422Z Requirement already satisfied: torch in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.19.0a0+d23a6e1) (2.5.0a0+gitb9d86fa) 2024-08-08T21:01:47.3590866Z Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torchvision==0.19.0a0+d23a6e1) (10.3.0) 2024-08-08T21:01:47.3811697Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (3.13.1) 2024-08-08T21:01:47.3816414Z Requirement already satisfied: typing-extensions>=4.8.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (4.12.2) 2024-08-08T21:01:47.3819724Z Requirement already satisfied: networkx in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (2.8.8) 2024-08-08T21:01:47.3822119Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (3.1.4) 2024-08-08T21:01:47.3825222Z Requirement already satisfied: fsspec in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (2024.6.1) 2024-08-08T21:01:47.3835199Z Requirement already satisfied: sympy>=1.13.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch->torchvision==0.19.0a0+d23a6e1) (1.13.1) 2024-08-08T21:01:47.3865436Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.0->torch->torchvision==0.19.0a0+d23a6e1) (1.3.0) 2024-08-08T21:01:47.4729022Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch->torchvision==0.19.0a0+d23a6e1) (2.1.5) 2024-08-08T21:01:47.4938570Z Building wheels for collected packages: torchvision 2024-08-08T21:03:03.3097186Z Building wheel for torchvision (setup.py) ... [?25l- \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | done 2024-08-08T21:03:03.3128078Z [?25h Created wheel for torchvision: filename=torchvision-0.19.0a0+d23a6e1-cp310-cp310-linux_x86_64.whl size=2025062 sha256=65e960c4df3603c819cad1e05d04d04404a6195d673b05bc891cc7cb1325260d 2024-08-08T21:03:03.3129769Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/0e/56/35/02931e71eb23fd2b85591c7ec05b733ca7c8b328a2fd151f96 2024-08-08T21:03:03.3166389Z Successfully built torchvision 2024-08-08T21:03:04.0091722Z Installing collected packages: torchvision 2024-08-08T21:03:04.4350075Z Successfully installed torchvision-0.19.0a0+d23a6e1 2024-08-08T21:03:04.5722498Z + '[' -n '' ']' 2024-08-08T21:03:04.5722891Z + test_python_shard 2 2024-08-08T21:03:04.5723275Z + [[ -z 5 ]] 2024-08-08T21:03:04.5724571Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --shard 2 5 --verbose 2024-08-08T21:03:04.6701801Z /var/lib/jenkins/workspace/test/run_test.py:21: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-08-08T21:03:04.6703964Z import pkg_resources 2024-08-08T21:03:08.2182831Z Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2024-08-08T21:03:08.2629763Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2024-08-08T21:03:08.3129573Z Ignoring disabled issues: [''] 2024-08-08T21:03:08.3241835Z Found test times from artifacts 2024-08-08T21:03:08.3673434Z Found test times from artifacts 2024-08-08T21:03:08.3687122Z Running 25% of tests based on TD 2024-08-08T21:03:08.3977364Z Running parallel tests on 3 processes 2024-08-08T21:03:08.3980461Z Name: tests to run (est. time: 57.2min) 2024-08-08T21:03:08.3980980Z Serial tests (0): 2024-08-08T21:03:08.3981398Z Parallel tests (23): 2024-08-08T21:03:08.3981857Z test_nestedtensor 1/1 2024-08-08T21:03:08.3985528Z test_decomp 4/13 2024-08-08T21:03:08.3985942Z test_decomp 6/13 2024-08-08T21:03:08.3986398Z test_decomp 7/13 2024-08-08T21:03:08.3986894Z inductor/test_torchinductor_opinfo 1/12 2024-08-08T21:03:08.3987518Z inductor/test_torchinductor_opinfo 8/12 2024-08-08T21:03:08.3988122Z inductor/test_torchinductor_opinfo 9/12 2024-08-08T21:03:08.3988551Z functorch/test_ops 4/6 2024-08-08T21:03:08.3988924Z functorch/test_aotdispatch 1/1 2024-08-08T21:03:08.3989476Z inductor/test_cudacodecache 1/1 2024-08-08T21:03:08.3990018Z export/test_retraceability 1/1 2024-08-08T21:03:08.3990482Z test_functionalization 1/1 2024-08-08T21:03:08.3992782Z test_ops 1/8 2024-08-08T21:03:08.3993258Z test_ops 6/8 2024-08-08T21:03:08.3993664Z test_ops 7/8 2024-08-08T21:03:08.3994124Z inductor/test_aot_inductor 3/16 2024-08-08T21:03:08.3994676Z inductor/test_aot_inductor 7/16 2024-08-08T21:03:08.3995187Z inductor/test_aot_inductor 9/16 2024-08-08T21:03:08.3995760Z inductor/test_compiled_optimizers 2/4 2024-08-08T21:03:08.3996270Z inductor/test_compiled_optimizers 3/4 2024-08-08T21:03:08.3996787Z inductor/test_compiled_optimizers 4/4 2024-08-08T21:03:08.3997240Z inductor/test_cuda_cpp_wrapper 3/4 2024-08-08T21:03:08.3997733Z inductor/test_cpu_cpp_wrapper 1/1 2024-08-08T21:03:08.3998152Z Name: excluded (est. time: 25.35min) 2024-08-08T21:03:08.3998613Z Serial tests (4): 2024-08-08T21:03:08.3998948Z inductor/test_flex_attention 2/2 2024-08-08T21:03:08.3999400Z test_fx 1/1 2024-08-08T21:03:08.3999696Z test_reductions 1/1 2024-08-08T21:03:08.4000050Z test_tensorexpr 1/1 2024-08-08T21:03:08.4000433Z Parallel tests (59): 2024-08-08T21:03:08.4000778Z inductor/test_torchinductor 4/5 2024-08-08T21:03:08.4001250Z test_schema_check 1/1 2024-08-08T21:03:08.4001586Z test_linalg 1/1 2024-08-08T21:03:08.4001955Z test_dataloader 1/1 2024-08-08T21:03:08.4002310Z inductor/test_fused_attention 1/1 2024-08-08T21:03:08.4002793Z inductor/test_kernel_benchmark 1/1 2024-08-08T21:03:08.4003203Z test_tensorboard 1/1 2024-08-08T21:03:08.4003663Z torch_np/numpy_tests/core/test_scalarmath 1/1 2024-08-08T21:03:08.4004127Z dynamo/test_logging 1/1 2024-08-08T21:03:08.4004551Z nn/test_embedding 1/1 2024-08-08T21:03:08.4004928Z test_functional_autograd_benchmark 1/1 2024-08-08T21:03:08.4005439Z inductor/test_standalone_compile 1/1 2024-08-08T21:03:08.4005862Z test_scatter_gather_ops 1/1 2024-08-08T21:03:08.4006304Z dynamo/test_modules 1/1 2024-08-08T21:03:08.4006670Z dynamo/test_aot_autograd 1/1 2024-08-08T21:03:08.4007124Z inductor/test_unbacked_symints 1/1 2024-08-08T21:03:08.4007547Z dynamo/test_structured_trace 1/1 2024-08-08T21:03:08.4008009Z export/test_passes 1/1 2024-08-08T21:03:08.4008728Z inductor/test_padding 1/1 2024-08-08T21:03:08.4009121Z functorch/test_ac 1/1 2024-08-08T21:03:08.4009499Z test_fx_experimental 1/1 2024-08-08T21:03:08.4009891Z test_view_ops 1/1 2024-08-08T21:03:08.4010256Z torch_np/numpy_tests/fft/test_helper 1/1 2024-08-08T21:03:08.4010758Z test_public_bindings 1/1 2024-08-08T21:03:08.4011159Z nn/test_multihead_attention 1/1 2024-08-08T21:03:08.4011841Z dynamo/test_autograd_function 1/1 2024-08-08T21:03:08.4012327Z test_dynamic_shapes 1/1 2024-08-08T21:03:08.4012719Z torch_np/numpy_tests/core/test_numeric 1/1 2024-08-08T21:03:08.4013235Z inductor/test_loop_ordering 1/1 2024-08-08T21:03:08.4013634Z export/test_torchbind 1/1 2024-08-08T21:03:08.4014066Z test_segment_reductions 1/1 2024-08-08T21:03:08.4014444Z test_logging 1/1 2024-08-08T21:03:08.4014817Z test_indexing 1/1 2024-08-08T21:03:08.4015495Z dynamo/test_profiler 1/1 2024-08-08T21:03:08.4015926Z torch_np/numpy_tests/lib/test_function_base 1/1 2024-08-08T21:03:08.4016471Z dynamo/test_trace_rules 1/1 2024-08-08T21:03:08.4016849Z nn/test_parametrization 1/1 2024-08-08T21:03:08.4017285Z test_namedtensor 1/1 2024-08-08T21:03:08.4017637Z dynamo/test_recompiles 1/1 2024-08-08T21:03:08.4018057Z test_fx_passes 1/1 2024-08-08T21:03:08.4018442Z torch_np/numpy_tests/lib/test_shape_base_ 1/1 2024-08-08T21:03:08.4018958Z dynamo/test_view 1/1 2024-08-08T21:03:08.4019294Z dynamo/test_config 1/1 2024-08-08T21:03:08.4019704Z lazy/test_reuse_ir 1/1 2024-08-08T21:03:08.4020067Z dynamo/test_debug_utils 1/1 2024-08-08T21:03:08.4020480Z test_fx_reinplace_pass 1/1 2024-08-08T21:03:08.4020871Z test_subclass 1/1 2024-08-08T21:03:08.4021190Z test_dlpack 1/1 2024-08-08T21:03:08.4021619Z torch_np/numpy_tests/lib/test_index_tricks 1/1 2024-08-08T21:03:08.4022074Z test_cuda_sanitizer 1/1 2024-08-08T21:03:08.4022548Z torch_np/numpy_tests/core/test_numerictypes 1/1 2024-08-08T21:03:08.4023014Z functorch/test_logging 1/1 2024-08-08T21:03:08.4023382Z export/test_schema 1/1 2024-08-08T21:03:08.4023725Z test_itt 1/1 2024-08-08T21:03:08.4024097Z inductor/test_codegen_triton 1/1 2024-08-08T21:03:08.4024492Z test_type_hints 1/1 2024-08-08T21:03:08.4024827Z lazy/test_meta_kernel 1/1 2024-08-08T21:03:08.4025171Z test_hub 1/1 2024-08-08T21:03:08.4025475Z optim/test_optim 1/1 2024-08-08T21:03:08.4026347Z Starting test batch 'tests to run' 0.0 seconds after initiating testing 2024-08-08T21:03:08.4052924Z Running test_nestedtensor 1/1 ... [2024-08-08 21:03:08.404867] 2024-08-08T21:03:08.4057179Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:08.405236] 2024-08-08T21:03:13.0787281Z 2024-08-08T21:03:13.0788816Z test_nestedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.1_f8a9837d23968f84_.log 2024-08-08T21:03:13.0789817Z Running 0 items in this shard: 2024-08-08T21:03:13.0790070Z 2024-08-08T21:03:13.0792441Z Running test_decomp 4/13 ... [2024-08-08 21:03:13.078919] 2024-08-08T21:03:13.0797074Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=4', '--num-shards=13', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:13.079305] 2024-08-08T21:03:19.2046590Z 2024-08-08T21:03:19.2047979Z test_decomp 4/13 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_4.13_fc04e15fd9210c6b_.log 2024-08-08T21:03:19.2049004Z Running 0 items in this shard: 2024-08-08T21:03:19.2049362Z 2024-08-08T21:03:19.2049882Z Running test_decomp 6/13 ... [2024-08-08 21:03:19.204639] 2024-08-08T21:03:19.2054424Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=6', '--num-shards=13', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:19.205024] 2024-08-08T21:03:25.3305628Z 2024-08-08T21:03:25.3307496Z test_decomp 6/13 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.13_933edb735d04209e_.log 2024-08-08T21:03:25.3308438Z Running 0 items in this shard: 2024-08-08T21:03:25.3308785Z 2024-08-08T21:03:25.3311233Z Running test_decomp 7/13 ... [2024-08-08 21:03:25.330792] 2024-08-08T21:03:25.3317864Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=7', '--num-shards=13', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:25.331167] 2024-08-08T21:03:31.4076431Z 2024-08-08T21:03:31.4077970Z test_decomp 7/13 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_7.13_2a3e4902a82cefef_.log 2024-08-08T21:03:31.4079133Z Running 0 items in this shard: 2024-08-08T21:03:31.4079400Z 2024-08-08T21:03:31.4082529Z Running inductor/test_torchinductor_opinfo 1/12 ... [2024-08-08 21:03:31.407914] 2024-08-08T21:03:31.4086801Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=1', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:31.408270] 2024-08-08T21:03:40.8915708Z 2024-08-08T21:03:40.8917620Z inductor/test_torchinductor_opinfo 1/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_1.12_bf7ae5ad0edfb927_.log 2024-08-08T21:03:40.8918912Z Running 0 items in this shard: 2024-08-08T21:03:40.8919191Z 2024-08-08T21:03:40.8919660Z Running inductor/test_torchinductor_opinfo 8/12 ... [2024-08-08 21:03:40.891210] 2024-08-08T21:03:40.8921757Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=8', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:40.891614] 2024-08-08T21:03:48.6700031Z 2024-08-08T21:03:48.6702776Z inductor/test_torchinductor_opinfo 8/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.12_7e34231148571d6c_.log 2024-08-08T21:03:48.6704390Z Running 0 items in this shard: 2024-08-08T21:03:48.6704646Z 2024-08-08T21:03:48.6705061Z Running inductor/test_torchinductor_opinfo 9/12 ... [2024-08-08 21:03:48.669862] 2024-08-08T21:03:48.6707391Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=9', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:48.670278] 2024-08-08T21:03:56.4487765Z 2024-08-08T21:03:56.4489525Z inductor/test_torchinductor_opinfo 9/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_9.12_1b32c87723b83937_.log 2024-08-08T21:03:56.4490989Z Running 0 items in this shard: 2024-08-08T21:03:56.4491253Z 2024-08-08T21:03:56.4491595Z Running functorch/test_ops 4/6 ... [2024-08-08 21:03:56.448716] 2024-08-08T21:03:56.4495845Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'serial', '--shard-id=4', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:03:56.449145] 2024-08-08T21:04:03.0251263Z 2024-08-08T21:04:03.0252702Z functorch/test_ops 4/6 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_4.6_3d5e5fa8cb52aace_.log 2024-08-08T21:04:03.0254335Z Running 0 items in this shard: 2024-08-08T21:04:03.0254617Z 2024-08-08T21:04:03.0257627Z Running functorch/test_aotdispatch 1/1 ... [2024-08-08 21:04:03.025150] 2024-08-08T21:04:03.0260235Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:03.025568] 2024-08-08T21:04:07.7490228Z 2024-08-08T21:04:07.7492071Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_aea2f41b0ae17148_.log 2024-08-08T21:04:07.7493144Z Running 0 items in this shard: 2024-08-08T21:04:07.7493402Z 2024-08-08T21:04:07.7493791Z Running inductor/test_cudacodecache 1/1 ... [2024-08-08 21:04:07.748941] 2024-08-08T21:04:07.7497138Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudacodecache.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:07.749294] 2024-08-08T21:04:10.5692436Z 2024-08-08T21:04:10.5694063Z inductor/test_cudacodecache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudacodecache_1.1_0a92d31f2f536eb8_.log 2024-08-08T21:04:10.5695146Z Running 0 items in this shard: 2024-08-08T21:04:10.5695417Z 2024-08-08T21:04:10.5695788Z Running export/test_retraceability 1/1 ... [2024-08-08 21:04:10.569205] 2024-08-08T21:04:10.5700335Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_retraceability.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:10.569563] 2024-08-08T21:04:15.0926921Z 2024-08-08T21:04:15.0928319Z export/test_retraceability 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_retraceability_1.1_5d8ce0917ca3e98e_.log 2024-08-08T21:04:15.0929390Z Running 0 items in this shard: 2024-08-08T21:04:15.0929653Z 2024-08-08T21:04:15.0930286Z Running test_functionalization 1/1 ... [2024-08-08 21:04:15.092683] 2024-08-08T21:04:15.0934922Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_functionalization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:15.093057] 2024-08-08T21:04:17.9630018Z 2024-08-08T21:04:17.9631716Z test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_functionalization_1.1_f3c5c5183a648b1d_.log 2024-08-08T21:04:17.9632886Z Running 0 items in this shard: 2024-08-08T21:04:17.9633199Z 2024-08-08T21:04:17.9633482Z Running test_ops 1/8 ... [2024-08-08 21:04:17.963022] 2024-08-08T21:04:17.9637911Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:17.963404] 2024-08-08T21:04:29.5982691Z 2024-08-08T21:04:29.5984796Z test_ops 1/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_1.8_efc9e26a3bf1c507_.log 2024-08-08T21:04:29.5985745Z Running 0 items in this shard: 2024-08-08T21:04:29.5986008Z 2024-08-08T21:04:29.5986288Z Running test_ops 6/8 ... [2024-08-08 21:04:29.597769] 2024-08-08T21:04:29.5987908Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=6', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:29.598122] 2024-08-08T21:04:41.1841631Z 2024-08-08T21:04:41.1842849Z test_ops 6/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_6.8_30418cf82e4b3793_.log 2024-08-08T21:04:41.1844843Z Running 0 items in this shard: 2024-08-08T21:04:41.1845215Z 2024-08-08T21:04:41.1846028Z Running test_ops 7/8 ... [2024-08-08 21:04:41.184189] 2024-08-08T21:04:41.1849806Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'serial', '--shard-id=7', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:41.184546] 2024-08-08T21:04:52.7202287Z 2024-08-08T21:04:52.7203557Z test_ops 7/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_7.8_e7d2144b325fb767_.log 2024-08-08T21:04:52.7204535Z Running 0 items in this shard: 2024-08-08T21:04:52.7204904Z 2024-08-08T21:04:52.7205348Z Running inductor/test_aot_inductor 3/16 ... [2024-08-08 21:04:52.720220] 2024-08-08T21:04:52.7209688Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=3', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:52.720568] 2024-08-08T21:04:59.4984170Z 2024-08-08T21:04:59.4985712Z inductor/test_aot_inductor 3/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_3.16_fe15845382a81ebe_.log 2024-08-08T21:04:59.4987046Z Running 0 items in this shard: 2024-08-08T21:04:59.4987341Z 2024-08-08T21:04:59.4988027Z Running inductor/test_aot_inductor 7/16 ... [2024-08-08 21:04:59.498494] 2024-08-08T21:04:59.4993083Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=7', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:04:59.498910] 2024-08-08T21:05:06.2765920Z 2024-08-08T21:05:06.2767955Z inductor/test_aot_inductor 7/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_7.16_b87b9e27d29daf20_.log 2024-08-08T21:05:06.2769136Z Running 0 items in this shard: 2024-08-08T21:05:06.2769412Z 2024-08-08T21:05:06.2769899Z Running inductor/test_aot_inductor 9/16 ... [2024-08-08 21:05:06.276513] 2024-08-08T21:05:06.2772559Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=9', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:06.276880] 2024-08-08T21:05:13.1039397Z 2024-08-08T21:05:13.1041329Z inductor/test_aot_inductor 9/16 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_9.16_d9a00153b99de1d3_.log 2024-08-08T21:05:13.1042805Z Running 0 items in this shard: 2024-08-08T21:05:13.1043093Z 2024-08-08T21:05:13.1043838Z Running inductor/test_compiled_optimizers 2/4 ... [2024-08-08 21:05:13.104081] 2024-08-08T21:05:13.1049809Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'serial', '--shard-id=2', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:13.104525] 2024-08-08T21:05:20.1816211Z 2024-08-08T21:05:20.1818053Z inductor/test_compiled_optimizers 2/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_2.4_f18894bedfde7652_.log 2024-08-08T21:05:20.1819326Z Running 0 items in this shard: 2024-08-08T21:05:20.1819590Z 2024-08-08T21:05:20.1820003Z Running inductor/test_compiled_optimizers 3/4 ... [2024-08-08 21:05:20.181657] 2024-08-08T21:05:20.1824089Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'serial', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:20.181996] 2024-08-08T21:05:27.2222756Z 2024-08-08T21:05:27.2226152Z inductor/test_compiled_optimizers 3/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_3.4_07003ecdaacb8e39_.log 2024-08-08T21:05:27.2227539Z Running 0 items in this shard: 2024-08-08T21:05:27.2227800Z 2024-08-08T21:05:27.2228229Z Running inductor/test_compiled_optimizers 4/4 ... [2024-08-08 21:05:27.222146] 2024-08-08T21:05:27.2230094Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'serial', '--shard-id=4', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:27.222558] 2024-08-08T21:05:34.2529110Z 2024-08-08T21:05:34.2530855Z inductor/test_compiled_optimizers 4/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_4.4_ef323a71a4a2fb79_.log 2024-08-08T21:05:34.2532121Z Running 0 items in this shard: 2024-08-08T21:05:34.2532490Z 2024-08-08T21:05:34.2532964Z Running inductor/test_cuda_cpp_wrapper 3/4 ... [2024-08-08 21:05:34.252502] 2024-08-08T21:05:34.2534816Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_cpp_wrapper.py', '-m', 'serial', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:34.252881] 2024-08-08T21:05:40.8804617Z 2024-08-08T21:05:40.8806479Z inductor/test_cuda_cpp_wrapper 3/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_cpp_wrapper_3.4_452a71924e6bd520_.log 2024-08-08T21:05:40.8807612Z Running 0 items in this shard: 2024-08-08T21:05:40.8807872Z 2024-08-08T21:05:40.8808342Z Running inductor/test_cpu_cpp_wrapper 1/1 ... [2024-08-08 21:05:40.880281] 2024-08-08T21:05:40.8810590Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_cpp_wrapper.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:40.880661] 2024-08-08T21:05:47.3721739Z 2024-08-08T21:05:47.3723665Z inductor/test_cpu_cpp_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_cpp_wrapper_1.1_78914abce923ead5_.log 2024-08-08T21:05:47.3724628Z 2024-08-08T21:05:47.3797081Z Running test_nestedtensor 1/1 ... [2024-08-08 21:05:47.379096] 2024-08-08T21:05:47.3800158Z Running test_decomp 4/13 ... [2024-08-08 21:05:47.379625] 2024-08-08T21:05:47.3802169Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:47.379673] 2024-08-08T21:05:47.3803856Z Running test_decomp 6/13 ... [2024-08-08 21:05:47.379731] 2024-08-08T21:05:47.3805820Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=4', '--num-shards=13', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:47.380056] 2024-08-08T21:05:47.3808374Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=6', '--num-shards=13', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:05:47.380258] 2024-08-08T21:08:07.3871483Z 2024-08-08T21:08:07.3873439Z test_nestedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.1_343f16bee441ba94_.log 2024-08-08T21:08:07.4631987Z Running 1416 items in this shard: test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_cat, test/test_nestedtensor.py::TestNestedTensor::test_copy_, test/test_nestedtensor.py::TestNestedTensor::test_default_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_dim, test/test_nestedtensor.py::TestNestedTensor::test_fill_, test/test_nestedtensor.py::TestNestedTensor::test_is_contiguous, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_ones_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_randn_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_zeros_like, test/test_nestedtensor.py::TestNestedTensor::test_nested_namespace, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor_matching_dim, test/test_nestedtensor.py::TestNestedTensor::test_numel, test/test_nestedtensor.py::TestNestedTensor::test_repr_string, test/test_nestedtensor.py::TestNestedTensor::test_size, test/test_nestedtensor.py::TestNestedTensor::test_size_dim, test/test_nestedtensor.py::TestNestedTensor::test_stride, test/test_nestedtensor.py::TestNestedTensor::test_to, test/test_nestedtensor.py::TestNestedTensor::test_to_padded_tensor_on_empty_tensor, test/test_nestedtensor.py::TestNestedTensor::test_unbind_0, test/test_nestedtensor.py::TestNestedTensor::test_unbind_1, test/test_nestedtensor.py::TestNestedTensor::test_unbind_3, test/test_nestedtensor.py::TestNestedTensor::test_unbind_4, test/test_nestedtensor.py::TestNestedTensor::test_unbind_dim, test/test_nestedtensor.py::TestNestedTensor::test_zero_, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_abs__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_abs_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_cos_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_gelu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_gelu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_logical_not_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_neg_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_relu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_relu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_sgn_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_silu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_silu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_sin_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_tanh__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_activations_tanh_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_binary_ops_with_scalar_eq_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_binary_ops_with_scalar_ge_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_clone_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_device_checks_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_strided_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_384_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_384_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_8_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_8_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sum_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_then_from_padded_tensor_no_transform0213_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_abs_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_accumulate_grad_different_strides_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_as_nested_tensor_propagates_gradients_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_add_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_for_add_op_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_for_sub_op_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_sub_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_gelu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_indexing_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_edge_case_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1023_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1024_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_256_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_512_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_513_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_masked_fill_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_mask_and_to_padded_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_fused_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_generates_leaf_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_plus_transpose_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_squeeze_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_squeeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_to_padded_tensor_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_relu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_selu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_set_requires_grad_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_set_requires_grad_from_mask_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_split_with_sizes_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_to_buffer_series_ops_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_unbind_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_values_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_apply__cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_broadcasting_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_transposed_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_chunk_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_preserves_metadata_cache_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_min_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_propagated_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flatten_decomp_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_contiguous_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_same_size_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_with_pinned_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layout_under_torch_dispatch_mode_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_empty_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_randn_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_ones_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_zeros_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_5_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_keepdim_False_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_mean_dim_reduce_multiple_dims_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_narrow_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_activation_checkpoint_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_fx_trace_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_njt_cat_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_pin_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_profiler_sequence_nr_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_True_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_specialize_dynamic_shape_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_specialize_dynamic_shape_recompile_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_with_sizes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_squeeze_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_tensor_attributes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_threshold_backward_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_copy_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_transposed_inputs_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_0_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_1_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_equals_2_bad_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_last_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unsafe_view_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_view_ragged_idx_not_one_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_views_inherit_ragged_dim_cuda, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_floor_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igammac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isreal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_binary_return_by_ref_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_not_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_legendre_polynomial_p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_igammac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isreal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_return_by_ref_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_not_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_legendre_polynomial_p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_xlogy_cuda_float32 2024-08-08T21:08:07.5364455Z 2024-08-08T21:08:10.5424185Z Running test_decomp 7/13 ... [2024-08-08 21:08:10.541736] 2024-08-08T21:08:10.5426620Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=7', '--num-shards=13', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:08:10.542143] 2024-08-08T21:22:13.7454563Z 2024-08-08T21:22:13.7456358Z test_decomp 7/13 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_7.13_9de2549ae9edc5c0_.log 2024-08-08T21:22:13.7827963Z Running 645 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_inverse_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_inverse_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_det_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_singular_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_singular_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorsolve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_solve_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mse_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rms_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_quantile_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_neg_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_blackman_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_kaiser_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triangular_solve_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick__softmax_backward_data_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_block_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_clamp_max_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_expand_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_index_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_rad2deg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unsafe_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unsqueeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_igammac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardswish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mse_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mse_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_renorm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_uint8, test/test_decomp.py::HasDecompTest::test_aten_core_operators 2024-08-08T21:22:13.8222817Z 2024-08-08T21:22:16.9544772Z Running inductor/test_torchinductor_opinfo 1/12 ... [2024-08-08 21:22:16.953833] 2024-08-08T21:22:16.9549109Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=1', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:22:16.954265] 2024-08-08T21:23:48.7963004Z 2024-08-08T21:23:48.7964507Z test_decomp 4/13 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_4.13_e03c060950310a19_.log 2024-08-08T21:23:48.8207693Z Running 674 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igammac_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_householder_product_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_triangular_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_triangular_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_log_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_batch_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_number_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_qr_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_svd_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_svd_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick__batch_norm_with_update_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick__upsample_bilinear2d_aa_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_left_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_dot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_index_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_masked_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nan_to_num_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_inf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_rsub_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_stack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_tril_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_lcm_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_int64, test/test_decomp.py::DecompOneOffTestsCUDA::test_threshold_backward_dtype_cuda 2024-08-08T21:23:48.8439357Z 2024-08-08T21:23:51.9383444Z Running inductor/test_torchinductor_opinfo 8/12 ... [2024-08-08 21:23:51.937671] 2024-08-08T21:23:51.9386681Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=8', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:23:51.938100] 2024-08-08T21:29:16.0184381Z 2024-08-08T21:29:16.0185449Z PRINTING LOG FILE of test_decomp 6/13 (test/test-reports/test_decomp_6.13_e203bdf501237379_.log) 2024-08-08T21:29:16.0189075Z Test results will be stored in test-reports/python-pytest/test_decomp/test_decomp-dd9d012de50495e6.xml 2024-08-08T21:29:16.0192003Z ============================= test session starts ============================== 2024-08-08T21:29:16.0193248Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T21:29:16.0194222Z cachedir: .pytest_cache 2024-08-08T21:29:16.0195507Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T21:29:16.0225431Z rootdir: /var/lib/jenkins/workspace 2024-08-08T21:29:16.0225957Z configfile: pytest.ini 2024-08-08T21:29:16.0227139Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T21:29:16.0228273Z collecting ... collected 8818 items 2024-08-08T21:29:16.0228784Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T21:29:16.0513275Z Running 697 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick__batch_norm_with_update_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_glu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_t_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_vdot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_zero__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_logsigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_renorm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_GRU_train_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_eval_mode_cuda_float64 2024-08-08T21:29:16.0786955Z 2024-08-08T21:29:16.0787867Z test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex32 PASSED [0.9416s] [ 0%] 2024-08-08T21:29:16.0789111Z test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_int8 PASSED [0.2357s] [ 0%] 2024-08-08T21:29:16.0790303Z test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_bool PASSED [0.4751s] [ 0%] 2024-08-08T21:29:16.0791458Z test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float32 PASSED [1.5572s] [ 0%] 2024-08-08T21:29:16.0792769Z test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float32 PASSED [3.7749s] [ 0%] 2024-08-08T21:29:16.0793926Z test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int16 PASSED [0.1357s] [ 0%] 2024-08-08T21:29:16.0795058Z test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int32 PASSED [0.1199s] [ 1%] 2024-08-08T21:29:16.0796184Z test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_uint8 PASSED [0.1407s] [ 1%] 2024-08-08T21:29:16.0797306Z test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int32 PASSED [0.2030s] [ 1%] 2024-08-08T21:29:16.0798926Z test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_float16 PASSED [0.0448s] [ 1%] 2024-08-08T21:29:16.0800989Z test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_int8 PASSED [0.4503s] [ 1%] 2024-08-08T21:29:16.0802375Z test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_bfloat16 PASSED [0.1124s] [ 1%] 2024-08-08T21:29:16.0803665Z test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_bool PASSED [2.0336s] [ 1%] 2024-08-08T21:29:16.0805041Z test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64 PASSED [1.9303s] [ 2%] 2024-08-08T21:29:16.0806305Z test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex128 PASSED [0.8051s] [ 2%] 2024-08-08T21:29:16.0807455Z test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex64 PASSED [0.7072s] [ 2%] 2024-08-08T21:29:16.0808601Z test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float32 PASSED [0.0589s] [ 2%] 2024-08-08T21:29:16.0809754Z test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex128 PASSED [0.1118s] [ 2%] 2024-08-08T21:29:16.0810889Z test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_uint8 PASSED [0.1286s] [ 2%] 2024-08-08T21:29:16.0812019Z test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_complex128 PASSED [0.5419s] [ 2%] 2024-08-08T21:29:16.0813232Z test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float16 PASSED [0.2137s] [ 3%] 2024-08-08T21:29:16.0814465Z test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_bfloat16 PASSED [0.1925s] [ 3%] 2024-08-08T21:29:16.0816484Z test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_complex64 PASSED [0.0249s] [ 3%] 2024-08-08T21:29:16.0817650Z test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int32 PASSED [0.0440s] [ 3%] 2024-08-08T21:29:16.0818762Z test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int8 PASSED [0.0456s] [ 3%] 2024-08-08T21:29:16.0819872Z test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bool PASSED [0.0475s] [ 3%] 2024-08-08T21:29:16.0820982Z test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_uint8 PASSED [0.0445s] [ 3%] 2024-08-08T21:29:16.0822119Z test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float32 PASSED [0.1937s] [ 4%] 2024-08-08T21:29:16.0823270Z test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_float16 PASSED [0.0148s] [ 4%] 2024-08-08T21:29:16.0824409Z test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int64 PASSED [0.0106s] [ 4%] 2024-08-08T21:29:16.0825541Z test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int32 PASSED [0.0485s] [ 4%] 2024-08-08T21:29:16.0826678Z test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_uint8 PASSED [0.0248s] [ 4%] 2024-08-08T21:29:16.0827819Z test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int8 PASSED [0.0083s] [ 4%] 2024-08-08T21:29:16.0829025Z test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_int8 PASSED [0.0056s] [ 4%] 2024-08-08T21:29:16.0830532Z test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_float32 SKIPPED [0.0002s] (Expected: new_empty_strided is not comparable) [ 5%] 2024-08-08T21:29:16.0832012Z test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_int64 PASSED [0.0322s] [ 5%] 2024-08-08T21:29:16.0833140Z test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_float32 PASSED [0.1876s] [ 5%] 2024-08-08T21:29:16.0834545Z test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bool PASSED [0.0295s] [ 5%] 2024-08-08T21:29:16.0835712Z test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_bfloat16 PASSED [0.0115s] [ 5%] 2024-08-08T21:29:16.0837010Z test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float32 PASSED [0.0172s] [ 5%] 2024-08-08T21:29:16.0838189Z test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int16 PASSED [0.0178s] [ 5%] 2024-08-08T21:29:16.0839357Z test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_bfloat16 PASSED [0.0222s] [ 6%] 2024-08-08T21:29:16.0840542Z test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bfloat16 PASSED [0.0090s] [ 6%] 2024-08-08T21:29:16.0841748Z test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex128 PASSED [0.6811s] [ 6%] 2024-08-08T21:29:16.0842917Z test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float32 PASSED [0.0274s] [ 6%] 2024-08-08T21:29:16.0844154Z test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128 PASSED [0.1670s] [ 6%] 2024-08-08T21:29:16.0845374Z test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64 PASSED [0.0170s] [ 6%] 2024-08-08T21:29:16.0846551Z test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16 PASSED [9.0884s] [ 6%] 2024-08-08T21:29:16.0847739Z test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float16 PASSED [0.0334s] [ 7%] 2024-08-08T21:29:16.0848959Z test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int64 PASSED [0.0455s] [ 7%] 2024-08-08T21:29:16.0850151Z test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_complex32 PASSED [0.0513s] [ 7%] 2024-08-08T21:29:16.0851302Z test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int64 PASSED [0.0266s] [ 7%] 2024-08-08T21:29:16.0852406Z test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int8 PASSED [0.0264s] [ 7%] 2024-08-08T21:29:16.0853553Z test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex64 PASSED [0.0270s] [ 7%] 2024-08-08T21:29:16.0854757Z test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_uint8 PASSED [0.0367s] [ 7%] 2024-08-08T21:29:16.0855894Z test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex32 PASSED [0.0288s] [ 8%] 2024-08-08T21:29:16.0857047Z test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex128 PASSED [0.0418s] [ 8%] 2024-08-08T21:29:16.0858186Z test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int32 PASSED [0.0208s] [ 8%] 2024-08-08T21:29:16.0859332Z test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex128 PASSED [0.7387s] [ 8%] 2024-08-08T21:29:16.0860496Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float64 PASSED [2.0872s] [ 8%] 2024-08-08T21:29:16.0861662Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int64 PASSED [0.7391s] [ 8%] 2024-08-08T21:29:16.0862819Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bool PASSED [0.7432s] [ 8%] 2024-08-08T21:29:16.0863989Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float16 PASSED [0.4016s] [ 9%] 2024-08-08T21:29:16.0865128Z test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int32 PASSED [0.0243s] [ 9%] 2024-08-08T21:29:16.0866300Z test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bfloat16 PASSED [0.0157s] [ 9%] 2024-08-08T21:29:16.0867628Z test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_complex128 PASSED [3.0570s] [ 9%] 2024-08-08T21:29:16.0868847Z test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_bfloat16 PASSED [0.0050s] [ 9%] 2024-08-08T21:29:16.0870152Z test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_bfloat16 PASSED [0.1420s] [ 9%] 2024-08-08T21:29:16.0871356Z test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int8 PASSED [0.0050s] [ 9%] 2024-08-08T21:29:16.0872625Z test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex32 PASSED [0.5665s] [ 10%] 2024-08-08T21:29:16.0873754Z test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_bfloat16 PASSED [1.5343s] [ 10%] 2024-08-08T21:29:16.0874921Z test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_int8 PASSED [2.3769s] [ 10%] 2024-08-08T21:29:16.0876053Z test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float32 PASSED [0.0359s] [ 10%] 2024-08-08T21:29:16.0877211Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_bfloat16 PASSED [4.4666s] [ 10%] 2024-08-08T21:29:16.0878348Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_int32 PASSED [0.2691s] [ 10%] 2024-08-08T21:29:16.0879553Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int32 PASSED [0.3536s] [ 10%] 2024-08-08T21:29:16.0880815Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8 PASSED [0.3578s] [ 11%] 2024-08-08T21:29:16.0882016Z test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_float32 PASSED [0.0949s] [ 11%] 2024-08-08T21:29:16.0883212Z test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_complex64 PASSED [0.8565s] [ 11%] 2024-08-08T21:29:16.0884437Z test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int64 PASSED [0.1372s] [ 11%] 2024-08-08T21:29:16.0885589Z test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8 PASSED [2.1514s] [ 11%] 2024-08-08T21:29:16.0886701Z test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bool PASSED [0.2497s] [ 11%] 2024-08-08T21:29:16.0887909Z test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_bfloat16 PASSED [0.0450s] [ 11%] 2024-08-08T21:29:16.0889170Z test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_float16 PASSED [0.0437s] [ 12%] 2024-08-08T21:29:16.0890412Z test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int16 PASSED [0.1204s] [ 12%] 2024-08-08T21:29:16.0891597Z test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_float32 PASSED [0.0558s] [ 12%] 2024-08-08T21:29:16.0892745Z test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_int32 PASSED [0.0332s] [ 12%] 2024-08-08T21:29:16.0893892Z test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_bfloat16 PASSED [0.0239s] [ 12%] 2024-08-08T21:29:16.0895055Z test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex128 PASSED [0.3311s] [ 12%] 2024-08-08T21:29:16.0896403Z test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int64 SKIPPED [0.0024s] (empty in torch.int64 not supported) [ 12%] 2024-08-08T21:29:16.0897930Z test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int64 SKIPPED [0.0022s] (empty_like in torch.int64 not supported) [ 13%] 2024-08-08T21:29:16.0899524Z test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int32 SKIPPED [0.0022s] (empty_strided in torch.int32 not supported) [ 13%] 2024-08-08T21:29:16.0900866Z test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32 PASSED [0.2006s] [ 13%] 2024-08-08T21:29:16.0902072Z test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int8 PASSED [0.1235s] [ 13%] 2024-08-08T21:29:16.0903171Z test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int8 PASSED [0.0086s] [ 13%] 2024-08-08T21:29:16.0904377Z test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_bfloat16 PASSED [0.0146s] [ 13%] 2024-08-08T21:29:16.0905505Z test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_bfloat16 PASSED [0.1889s] [ 13%] 2024-08-08T21:29:16.0906634Z test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_uint8 PASSED [0.1460s] [ 14%] 2024-08-08T21:29:16.0907746Z test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int32 PASSED [0.1369s] [ 14%] 2024-08-08T21:29:16.0908843Z test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int32 PASSED [0.0094s] [ 14%] 2024-08-08T21:29:16.0909986Z test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int64 PASSED [0.0088s] [ 14%] 2024-08-08T21:29:16.0911140Z test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bfloat16 PASSED [0.0247s] [ 14%] 2024-08-08T21:29:16.0912375Z test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex128 PASSED [0.2904s] [ 14%] 2024-08-08T21:29:16.0913525Z test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float32 PASSED [0.1268s] [ 14%] 2024-08-08T21:29:16.0914653Z test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_uint8 PASSED [0.0268s] [ 15%] 2024-08-08T21:29:16.0916198Z test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float64 PASSED [0.0079s] [ 15%] 2024-08-08T21:29:16.0917345Z test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float64 PASSED [2.8680s] [ 15%] 2024-08-08T21:29:16.0918472Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8 PASSED [0.5963s] [ 15%] 2024-08-08T21:29:16.0919651Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex64 PASSED [0.7501s] [ 15%] 2024-08-08T21:29:16.0920849Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float64 PASSED [2.4894s] [ 15%] 2024-08-08T21:29:16.0921999Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_bool PASSED [0.6038s] [ 15%] 2024-08-08T21:29:16.0923147Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_uint8 PASSED [0.6054s] [ 16%] 2024-08-08T21:29:16.0924291Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_bool PASSED [0.2789s] [ 16%] 2024-08-08T21:29:16.0925429Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_bool PASSED [0.2745s] [ 16%] 2024-08-08T21:29:16.0926596Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_complex64 PASSED [0.7349s] [ 16%] 2024-08-08T21:29:16.0927777Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int8 PASSED [0.1622s] [ 16%] 2024-08-08T21:29:16.0928959Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_float64 PASSED [1.0216s] [ 16%] 2024-08-08T21:29:16.0930124Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int64 PASSED [0.1728s] [ 16%] 2024-08-08T21:29:16.0931265Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_bool PASSED [0.2835s] [ 17%] 2024-08-08T21:29:16.0932403Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int32 PASSED [0.3514s] [ 17%] 2024-08-08T21:29:16.0933534Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_bool PASSED [0.2583s] [ 17%] 2024-08-08T21:29:16.0934950Z test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float16 PASSED [0.0126s] [ 17%] 2024-08-08T21:29:16.0936072Z test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_bool PASSED [0.1044s] [ 17%] 2024-08-08T21:29:16.0937180Z test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int8 PASSED [0.1038s] [ 17%] 2024-08-08T21:29:16.0938407Z test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int64 PASSED [0.0244s] [ 17%] 2024-08-08T21:29:16.0939528Z test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_uint8 PASSED [0.0350s] [ 18%] 2024-08-08T21:29:16.0940686Z test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_float64 PASSED [2.8001s] [ 18%] 2024-08-08T21:29:16.0941830Z test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_uint8 PASSED [0.0269s] [ 18%] 2024-08-08T21:29:16.0942935Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int32 PASSED [0.1220s] [ 18%] 2024-08-08T21:29:16.0944059Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bfloat16 PASSED [0.4019s] [ 18%] 2024-08-08T21:29:16.0945180Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int16 PASSED [0.1221s] [ 18%] 2024-08-08T21:29:16.0946295Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int8 PASSED [0.1213s] [ 18%] 2024-08-08T21:29:16.0947404Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bfloat16 PASSED [0.0098s] [ 19%] 2024-08-08T21:29:16.0948517Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bool PASSED [0.0147s] [ 19%] 2024-08-08T21:29:16.0949637Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int8 PASSED [0.0082s] [ 19%] 2024-08-08T21:29:16.0950786Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_uint8 PASSED [0.0078s] [ 19%] 2024-08-08T21:29:16.0952009Z test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_bfloat16 PASSED [0.0200s] [ 19%] 2024-08-08T21:29:16.0953152Z test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_uint8 PASSED [0.0083s] [ 19%] 2024-08-08T21:29:16.0954269Z test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_int64 PASSED [0.1203s] [ 19%] 2024-08-08T21:29:16.0955348Z test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_uint8 PASSED [0.1213s] [ 20%] 2024-08-08T21:29:16.0956472Z test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int8 PASSED [0.0078s] [ 20%] 2024-08-08T21:29:16.0957624Z test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex64 PASSED [0.0710s] [ 20%] 2024-08-08T21:29:16.0958741Z test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_uint8 PASSED [0.1200s] [ 20%] 2024-08-08T21:29:16.0959857Z test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float32 PASSED [0.0646s] [ 20%] 2024-08-08T21:29:16.0960980Z test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int8 PASSED [0.0460s] [ 20%] 2024-08-08T21:29:16.0962115Z test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex32 PASSED [0.0057s] [ 20%] 2024-08-08T21:29:16.0963272Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int16 PASSED [0.0159s] [ 21%] 2024-08-08T21:29:16.0964460Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_complex128 PASSED [0.4015s] [ 21%] 2024-08-08T21:29:16.0965662Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_complex32 PASSED [0.0076s] [ 21%] 2024-08-08T21:29:16.0966824Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32 PASSED [0.0071s] [ 21%] 2024-08-08T21:29:16.0968022Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float64 PASSED [1.1593s] [ 21%] 2024-08-08T21:29:16.0969371Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int64 PASSED [0.0100s] [ 21%] 2024-08-08T21:29:16.0970597Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex64 PASSED [0.1280s] [ 21%] 2024-08-08T21:29:16.0971873Z test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_complex128 PASSED [0.1708s] [ 22%] 2024-08-08T21:29:16.0973048Z test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex128 PASSED [0.0744s] [ 22%] 2024-08-08T21:29:16.0974230Z test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_float64 PASSED [0.0353s] [ 22%] 2024-08-08T21:29:16.0975390Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex128 PASSED [0.2346s] [ 22%] 2024-08-08T21:29:16.0976565Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex64 PASSED [0.2358s] [ 22%] 2024-08-08T21:29:16.0977692Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int16 PASSED [0.0481s] [ 22%] 2024-08-08T21:29:16.0978806Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int32 PASSED [0.0489s] [ 22%] 2024-08-08T21:29:16.0979927Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_uint8 PASSED [0.0486s] [ 23%] 2024-08-08T21:29:16.0981050Z test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int8 PASSED [0.0479s] [ 23%] 2024-08-08T21:29:16.0982188Z test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int16 PASSED [0.0481s] [ 23%] 2024-08-08T21:29:16.0983324Z test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int32 PASSED [0.0483s] [ 23%] 2024-08-08T21:29:16.0984487Z test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex128 PASSED [0.0133s] [ 23%] 2024-08-08T21:29:16.0985657Z test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex64 PASSED [0.0125s] [ 23%] 2024-08-08T21:29:16.0986811Z test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float16 PASSED [0.0097s] [ 23%] 2024-08-08T21:29:16.0988127Z test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_uint8 SKIPPED [0.0023s] (item in torch.uint8 not supported) [ 24%] 2024-08-08T21:29:16.0989504Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int16 PASSED [0.3669s] [ 24%] 2024-08-08T21:29:16.0990850Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bfloat16 PASSED [0.4081s] [ 24%] 2024-08-08T21:29:16.0992304Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool PASSED [0.4601s] [ 24%] 2024-08-08T21:29:16.0993696Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_complex128 PASSED [0.8155s] [ 24%] 2024-08-08T21:29:16.0994986Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_int8 PASSED [0.0506s] [ 24%] 2024-08-08T21:29:16.0996172Z test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float32 PASSED [0.1328s] [ 24%] 2024-08-08T21:29:16.0997307Z test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_uint8 PASSED [0.4346s] [ 25%] 2024-08-08T21:29:16.0998406Z test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int8 PASSED [0.1220s] [ 25%] 2024-08-08T21:29:16.0999553Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_float32 PASSED [1.8219s] [ 25%] 2024-08-08T21:29:16.1000793Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64 PASSED [1.8006s] [ 25%] 2024-08-08T21:29:16.1002134Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_complex128 PASSED [1.5832s] [ 25%] 2024-08-08T21:29:16.1003331Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int64 PASSED [0.1972s] [ 25%] 2024-08-08T21:29:16.1004599Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int64 PASSED [0.0637s] [ 25%] 2024-08-08T21:29:16.1005814Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex128 PASSED [2.2455s] [ 26%] 2024-08-08T21:29:16.1007023Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_float64 PASSED [0.2714s] [ 26%] 2024-08-08T21:29:16.1008228Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float32 PASSED [0.2561s] [ 26%] 2024-08-08T21:29:16.1009440Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32 PASSED [0.0096s] [ 26%] 2024-08-08T21:29:16.1010685Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_solve_cuda_float32 PASSED [0.0164s] [ 26%] 2024-08-08T21:29:16.1011902Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex64 PASSED [8.2962s] [ 26%] 2024-08-08T21:29:16.1013114Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64 PASSED [3.3646s] [ 26%] 2024-08-08T21:29:16.1014277Z test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_bool PASSED [0.0097s] [ 27%] 2024-08-08T21:29:16.1015768Z test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_float64 PASSED [0.0319s] [ 27%] 2024-08-08T21:29:16.1016919Z test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int32 PASSED [0.0094s] [ 27%] 2024-08-08T21:29:16.1018042Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bfloat16 PASSED [0.0100s] [ 27%] 2024-08-08T21:29:16.1019164Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bool PASSED [0.0096s] [ 27%] 2024-08-08T21:29:16.1020286Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex64 PASSED [0.1907s] [ 27%] 2024-08-08T21:29:16.1021503Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float64 PASSED [0.7259s] [ 27%] 2024-08-08T21:29:16.1022776Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_uint8 PASSED [0.6841s] [ 28%] 2024-08-08T21:29:16.1024028Z test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex128 PASSED [1.2936s] [ 28%] 2024-08-08T21:29:16.1025250Z test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float16 PASSED [0.0071s] [ 28%] 2024-08-08T21:29:16.1026427Z test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_complex64 PASSED [2.8288s] [ 28%] 2024-08-08T21:29:16.1027593Z test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float64 PASSED [0.4379s] [ 28%] 2024-08-08T21:29:16.1028779Z test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex128 PASSED [0.0265s] [ 28%] 2024-08-08T21:29:16.1029989Z test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float32 PASSED [0.0164s] [ 28%] 2024-08-08T21:29:16.1031189Z test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_complex64 PASSED [1.5149s] [ 29%] 2024-08-08T21:29:16.1032459Z test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_complex64 PASSED [6.4986s] [ 29%] 2024-08-08T21:29:16.1033719Z test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float64 PASSED [31.5739s] [ 29%] 2024-08-08T21:29:16.1034962Z test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float64 PASSED [1.6800s] [ 29%] 2024-08-08T21:29:16.1036300Z test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float64 PASSED [0.0319s] [ 29%] 2024-08-08T21:29:16.1037417Z test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_uint8 PASSED [0.0206s] [ 29%] 2024-08-08T21:29:16.1038677Z test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_bfloat16 PASSED [0.0838s] [ 29%] 2024-08-08T21:29:16.1039774Z test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_bool PASSED [0.2255s] [ 30%] 2024-08-08T21:29:16.1040915Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int16 PASSED [0.6422s] [ 30%] 2024-08-08T21:29:16.1042111Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float64 PASSED [0.6344s] [ 30%] 2024-08-08T21:29:16.1043316Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8 PASSED [0.3679s] [ 30%] 2024-08-08T21:29:16.1044530Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_bfloat16 PASSED [2.2810s] [ 30%] 2024-08-08T21:29:16.1045766Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float64 PASSED [20.9082s] [ 30%] 2024-08-08T21:29:16.1046997Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_complex64 PASSED [0.5926s] [ 30%] 2024-08-08T21:29:16.1048181Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int16 PASSED [0.0887s] [ 31%] 2024-08-08T21:29:16.1049387Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float16 PASSED [0.7853s] [ 31%] 2024-08-08T21:29:16.1050583Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_int64 PASSED [0.9793s] [ 31%] 2024-08-08T21:29:16.1051771Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_float16 PASSED [0.1367s] [ 31%] 2024-08-08T21:29:16.1052975Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_bfloat16 PASSED [6.6631s] [ 31%] 2024-08-08T21:29:16.1054219Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex128 PASSED [17.9139s] [ 31%] 2024-08-08T21:29:16.1055458Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_float64 PASSED [24.1644s] [ 31%] 2024-08-08T21:29:16.1056640Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int64 PASSED [2.4831s] [ 32%] 2024-08-08T21:29:16.1057795Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_uint8 PASSED [0.8735s] [ 32%] 2024-08-08T21:29:16.1058998Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_float16 PASSED [0.0250s] [ 32%] 2024-08-08T21:29:16.1060225Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmax_cuda_float32 PASSED [3.8762s] [ 32%] 2024-08-08T21:29:16.1061433Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_bfloat16 PASSED [1.3276s] [ 32%] 2024-08-08T21:29:16.1062613Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float64 PASSED [12.3373s] [ 32%] 2024-08-08T21:29:16.1063801Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_bfloat16 PASSED [1.3360s] [ 32%] 2024-08-08T21:29:16.1064986Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float64 PASSED [13.2085s] [ 33%] 2024-08-08T21:29:16.1066178Z test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_complex64 PASSED [0.7960s] [ 33%] 2024-08-08T21:29:16.1067377Z test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float64 PASSED [0.4012s] [ 33%] 2024-08-08T21:29:16.1068553Z test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_int32 PASSED [0.1223s] [ 33%] 2024-08-08T21:29:16.1069863Z test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int16 PASSED [0.0054s] [ 33%] 2024-08-08T21:29:16.1071129Z test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_float64 PASSED [0.0561s] [ 33%] 2024-08-08T21:29:16.1072545Z test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16 PASSED [0.6640s] [ 34%] 2024-08-08T21:29:16.1073690Z test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_int64 PASSED [0.1231s] [ 34%] 2024-08-08T21:29:16.1075182Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex128 SKIPPED [0.0023s] (meshgrid in torch.complex128 not supported) [ 34%] 2024-08-08T21:29:16.1076896Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int64 SKIPPED [0.0023s] (meshgrid in torch.int64 not supported) [ 34%] 2024-08-08T21:29:16.1078632Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_complex128 SKIPPED [0.0022s] (meshgrid in torch.complex128 not supported) [ 34%] 2024-08-08T21:29:16.1080365Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int16 SKIPPED [0.0022s] (meshgrid in torch.int16 not supported) [ 34%] 2024-08-08T21:29:16.1081810Z test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bool PASSED [0.0052s] [ 34%] 2024-08-08T21:29:16.1082994Z test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_uint8 PASSED [0.1236s] [ 35%] 2024-08-08T21:29:16.1084119Z test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_int32 PASSED [0.0245s] [ 35%] 2024-08-08T21:29:16.1085249Z test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int16 PASSED [0.0159s] [ 35%] 2024-08-08T21:29:16.1086397Z test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_bfloat16 PASSED [0.0106s] [ 35%] 2024-08-08T21:29:16.1087530Z test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float32 PASSED [0.1192s] [ 35%] 2024-08-08T21:29:16.1088664Z test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float64 PASSED [0.1220s] [ 35%] 2024-08-08T21:29:16.1089789Z test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float64 PASSED [0.1245s] [ 35%] 2024-08-08T21:29:16.1090938Z test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex128 PASSED [6.9756s] [ 36%] 2024-08-08T21:29:16.1092104Z test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_float32 PASSED [2.2942s] [ 36%] 2024-08-08T21:29:16.1093268Z test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64 PASSED [0.2010s] [ 36%] 2024-08-08T21:29:16.1094446Z test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_complex32 PASSED [7.6685s] [ 36%] 2024-08-08T21:29:16.1095583Z test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int8 PASSED [0.3133s] [ 36%] 2024-08-08T21:29:16.1096755Z test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex32 PASSED [0.1849s] [ 36%] 2024-08-08T21:29:16.1097961Z test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float32 PASSED [0.1114s] [ 36%] 2024-08-08T21:29:16.1099176Z test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_bfloat16 PASSED [0.4875s] [ 37%] 2024-08-08T21:29:16.1100337Z test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_bool PASSED [0.1228s] [ 37%] 2024-08-08T21:29:16.1101674Z test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int32 SKIPPED [0.0024s] (new_empty in torch.int32 not supported) [ 37%] 2024-08-08T21:29:16.1103119Z test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex64 PASSED [0.1026s] [ 37%] 2024-08-08T21:29:16.1104409Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float16 PASSED [0.0207s] [ 37%] 2024-08-08T21:29:16.1105876Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float64 PASSED [0.0234s] [ 37%] 2024-08-08T21:29:16.1107258Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [0.0203s] [ 37%] 2024-08-08T21:29:16.1108694Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float64 PASSED [4.4536s] [ 38%] 2024-08-08T21:29:16.1110092Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int16 PASSED [0.3146s] [ 38%] 2024-08-08T21:29:16.1111400Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32 PASSED [0.2879s] [ 38%] 2024-08-08T21:29:16.1112749Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex32 PASSED [5.8297s] [ 38%] 2024-08-08T21:29:16.1114036Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_float32 PASSED [1.1089s] [ 38%] 2024-08-08T21:29:16.1115692Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_bfloat16 PASSED [0.1381s] [ 38%] 2024-08-08T21:29:16.1117090Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64 PASSED [0.4491s] [ 38%] 2024-08-08T21:29:16.1118466Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int8 PASSED [0.4300s] [ 39%] 2024-08-08T21:29:16.1119795Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32 PASSED [0.6553s] [ 39%] 2024-08-08T21:29:16.1121094Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16 PASSED [0.0628s] [ 39%] 2024-08-08T21:29:16.1122385Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float16 PASSED [0.0617s] [ 39%] 2024-08-08T21:29:16.1123796Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [2.1044s] [ 39%] 2024-08-08T21:29:16.1125305Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32 PASSED [0.0146s] [ 39%] 2024-08-08T21:29:16.1126750Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16 PASSED [0.0515s] [ 39%] 2024-08-08T21:29:16.1128160Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_bfloat16 PASSED [0.0674s] [ 40%] 2024-08-08T21:29:16.1129553Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16 PASSED [10.9659s] [ 40%] 2024-08-08T21:29:16.1130876Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64 PASSED [8.4345s] [ 40%] 2024-08-08T21:29:16.1132175Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float16 PASSED [0.1832s] [ 40%] 2024-08-08T21:29:16.1133454Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int8 PASSED [0.1837s] [ 40%] 2024-08-08T21:29:16.1134786Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float64 PASSED [2.0878s] [ 40%] 2024-08-08T21:29:16.1136149Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_bfloat16 PASSED [0.4664s] [ 40%] 2024-08-08T21:29:16.1137670Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32 PASSED [3.5017s] [ 41%] 2024-08-08T21:29:16.1139083Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32 PASSED [1.7996s] [ 41%] 2024-08-08T21:29:16.1140610Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64 PASSED [1.8359s] [ 41%] 2024-08-08T21:29:16.1141967Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float32 PASSED [23.8359s] [ 41%] 2024-08-08T21:29:16.1143275Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32 PASSED [6.4204s] [ 41%] 2024-08-08T21:29:16.1144621Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float32 PASSED [0.5632s] [ 41%] 2024-08-08T21:29:16.1145935Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float16 PASSED [0.0751s] [ 41%] 2024-08-08T21:29:16.1147289Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32 PASSED [1.0040s] [ 42%] 2024-08-08T21:29:16.1148708Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_bfloat16 PASSED [0.3901s] [ 42%] 2024-08-08T21:29:16.1150116Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16 PASSED [0.3018s] [ 42%] 2024-08-08T21:29:16.1151505Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_complex64 PASSED [5.7123s] [ 42%] 2024-08-08T21:29:16.1152911Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float64 PASSED [3.8646s] [ 42%] 2024-08-08T21:29:16.1154474Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_bfloat16 SKIPPED [0.0002s] (Expected: new_empty_strided is not comparable) [ 42%] 2024-08-08T21:29:16.1156018Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_uint8 PASSED [0.1743s] [ 42%] 2024-08-08T21:29:16.1157381Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_complex128 PASSED [0.2091s] [ 43%] 2024-08-08T21:29:16.1158946Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int64 SKIPPED [0.0025s] (nn.functional.relu6 in torch.int64 not supported) [ 43%] 2024-08-08T21:29:16.1160432Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_bfloat16 PASSED [0.0921s] [ 43%] 2024-08-08T21:29:16.1161684Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_cuda_float64 PASSED [0.2161s] [ 43%] 2024-08-08T21:29:16.1162991Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16 PASSED [0.2398s] [ 43%] 2024-08-08T21:29:16.1164284Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float64 PASSED [0.8945s] [ 43%] 2024-08-08T21:29:16.1165889Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float16 SKIPPED [0.0023s] (nn.functional.softshrink in torch.float16 not supported) [ 43%] 2024-08-08T21:29:16.1167465Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_complex64 PASSED [0.4724s] [ 44%] 2024-08-08T21:29:16.1168752Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8 PASSED [0.0190s] [ 44%] 2024-08-08T21:29:16.1170035Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float32 PASSED [0.1920s] [ 44%] 2024-08-08T21:29:16.1171532Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int8 PASSED [1.7422s] [ 44%] 2024-08-08T21:29:16.1172947Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float32 PASSED [2.3374s] [ 44%] 2024-08-08T21:29:16.1174440Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float16 PASSED [0.2232s] [ 44%] 2024-08-08T21:29:16.1175826Z test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 44%] 2024-08-08T21:29:16.1177238Z test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float64 SKIPPED [0.0009s] (Only runs on cpu) [ 45%] 2024-08-08T21:29:16.1178627Z test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_int16 SKIPPED [0.0009s] (Only runs on cpu) [ 45%] 2024-08-08T21:29:16.1180107Z test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_complex64 SKIPPED [0.0023s] (norm in torch.complex64 not supported) [ 45%] 2024-08-08T21:29:16.1181644Z test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_float16 SKIPPED [0.0023s] (norm in torch.float16 not supported) [ 45%] 2024-08-08T21:29:16.1183174Z test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float64 SKIPPED [0.0023s] (norm in torch.float64 not supported) [ 45%] 2024-08-08T21:29:16.1184557Z test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex128 PASSED [0.0221s] [ 45%] 2024-08-08T21:29:16.1185711Z test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex32 PASSED [0.0277s] [ 45%] 2024-08-08T21:29:16.1186849Z test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_float32 PASSED [0.0762s] [ 46%] 2024-08-08T21:29:16.1187979Z test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_uint8 PASSED [0.0111s] [ 46%] 2024-08-08T21:29:16.1189193Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64 PASSED [1.3050s] [ 46%] 2024-08-08T21:29:16.1190476Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int16 PASSED [0.7527s] [ 46%] 2024-08-08T21:29:16.1191822Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool PASSED [0.0088s] [ 46%] 2024-08-08T21:29:16.1193092Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8 PASSED [0.0089s] [ 46%] 2024-08-08T21:29:16.1194372Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16 PASSED [1.0788s] [ 46%] 2024-08-08T21:29:16.1195579Z test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float64 PASSED [0.0051s] [ 47%] 2024-08-08T21:29:16.1196739Z test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_complex128 PASSED [0.4748s] [ 47%] 2024-08-08T21:29:16.1197897Z test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bfloat16 PASSED [0.0081s] [ 47%] 2024-08-08T21:29:16.1199071Z test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float64 PASSED [0.0119s] [ 47%] 2024-08-08T21:29:16.1200219Z test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int16 PASSED [0.0186s] [ 47%] 2024-08-08T21:29:16.1201349Z test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int32 PASSED [0.0183s] [ 47%] 2024-08-08T21:29:16.1202481Z test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int64 PASSED [0.0181s] [ 47%] 2024-08-08T21:29:16.1203613Z test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_bfloat16 PASSED [0.0060s] [ 48%] 2024-08-08T21:29:16.1204837Z test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_bool PASSED [0.0462s] [ 48%] 2024-08-08T21:29:16.1205968Z test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex64 PASSED [0.1045s] [ 48%] 2024-08-08T21:29:16.1207092Z test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int64 PASSED [0.0055s] [ 48%] 2024-08-08T21:29:16.1208296Z test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16 PASSED [0.1255s] [ 48%] 2024-08-08T21:29:16.1209442Z test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_uint8 PASSED [0.1265s] [ 48%] 2024-08-08T21:29:16.1210583Z test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int64 PASSED [0.1579s] [ 48%] 2024-08-08T21:29:16.1211703Z test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_uint8 PASSED [0.1589s] [ 49%] 2024-08-08T21:29:16.1212902Z test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex32 PASSED [0.1478s] [ 49%] 2024-08-08T21:29:16.1214126Z test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int32 PASSED [0.0279s] [ 49%] 2024-08-08T21:29:16.1215648Z test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int32 PASSED [0.0586s] [ 49%] 2024-08-08T21:29:16.1216877Z test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_float64 PASSED [0.0055s] [ 49%] 2024-08-08T21:29:16.1218058Z test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int16 PASSED [0.0052s] [ 49%] 2024-08-08T21:29:16.1219217Z test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float32 PASSED [3.2012s] [ 49%] 2024-08-08T21:29:16.1220335Z test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int16 PASSED [1.0620s] [ 50%] 2024-08-08T21:29:16.1221448Z test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float32 PASSED [0.0696s] [ 50%] 2024-08-08T21:29:16.1222586Z test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_float16 PASSED [0.0155s] [ 50%] 2024-08-08T21:29:16.1223707Z test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float16 PASSED [0.0532s] [ 50%] 2024-08-08T21:29:16.1224875Z test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float32 PASSED [0.0055s] [ 50%] 2024-08-08T21:29:16.1226091Z test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int16 PASSED [0.0186s] [ 50%] 2024-08-08T21:29:16.1227337Z test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int64 PASSED [0.0182s] [ 50%] 2024-08-08T21:29:16.1228580Z test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_float64 PASSED [3.6158s] [ 51%] 2024-08-08T21:29:16.1229770Z test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int32 PASSED [0.0066s] [ 51%] 2024-08-08T21:29:16.1230874Z test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int16 PASSED [0.0507s] [ 51%] 2024-08-08T21:29:16.1232054Z test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float16 PASSED [0.0070s] [ 51%] 2024-08-08T21:29:16.1233295Z test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_exponential_cuda_float32 PASSED [0.2803s] [ 51%] 2024-08-08T21:29:16.1234625Z test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64 PASSED [0.5141s] [ 51%] 2024-08-08T21:29:16.1235854Z test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bool PASSED [0.0281s] [ 51%] 2024-08-08T21:29:16.1236975Z test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int64 PASSED [0.0276s] [ 52%] 2024-08-08T21:29:16.1238109Z test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex128 PASSED [0.8665s] [ 52%] 2024-08-08T21:29:16.1239412Z test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float16 PASSED [0.0089s] [ 52%] 2024-08-08T21:29:16.1240534Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float32 PASSED [0.2039s] [ 52%] 2024-08-08T21:29:16.1241754Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int8 PASSED [0.2660s] [ 52%] 2024-08-08T21:29:16.1242868Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float16 PASSED [0.0097s] [ 52%] 2024-08-08T21:29:16.1243980Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64 PASSED [0.1307s] [ 52%] 2024-08-08T21:29:16.1245110Z test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bfloat16 PASSED [0.3304s] [ 53%] 2024-08-08T21:29:16.1246261Z test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex128 PASSED [2.8296s] [ 53%] 2024-08-08T21:29:16.1247446Z test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int32 PASSED [2.9359s] [ 53%] 2024-08-08T21:29:16.1248688Z test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex64 PASSED [0.2351s] [ 53%] 2024-08-08T21:29:16.1249937Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_bool PASSED [0.3556s] [ 53%] 2024-08-08T21:29:16.1251170Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float32 PASSED [0.0068s] [ 53%] 2024-08-08T21:29:16.1252397Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int8 PASSED [0.3320s] [ 53%] 2024-08-08T21:29:16.1253689Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32 PASSED [1.0747s] [ 54%] 2024-08-08T21:29:16.1254941Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_bool PASSED [0.2599s] [ 54%] 2024-08-08T21:29:16.1256134Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_uint8 PASSED [0.9808s] [ 54%] 2024-08-08T21:29:16.1257389Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_uint8 PASSED [0.9255s] [ 54%] 2024-08-08T21:29:16.1258719Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_uint8 PASSED [0.9192s] [ 54%] 2024-08-08T21:29:16.1259957Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_bool PASSED [0.1858s] [ 54%] 2024-08-08T21:29:16.1261112Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float64 PASSED [0.3770s] [ 54%] 2024-08-08T21:29:16.1262280Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int16 PASSED [0.0082s] [ 55%] 2024-08-08T21:29:16.1263456Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float64 PASSED [0.4785s] [ 55%] 2024-08-08T21:29:16.1264736Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32 PASSED [0.3723s] [ 55%] 2024-08-08T21:29:16.1266077Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8 PASSED [1.0432s] [ 55%] 2024-08-08T21:29:16.1267339Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int16 PASSED [0.3141s] [ 55%] 2024-08-08T21:29:16.1268539Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int8 PASSED [0.1307s] [ 55%] 2024-08-08T21:29:16.1269801Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32 PASSED [0.0067s] [ 55%] 2024-08-08T21:29:16.1271109Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int64 PASSED [0.1837s] [ 56%] 2024-08-08T21:29:16.1272598Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32 PASSED [0.0066s] [ 56%] 2024-08-08T21:29:16.1273918Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int32 PASSED [0.2456s] [ 56%] 2024-08-08T21:29:16.1275296Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int8 PASSED [0.2346s] [ 56%] 2024-08-08T21:29:16.1276528Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int32 PASSED [0.2069s] [ 56%] 2024-08-08T21:29:16.1277838Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bool PASSED [0.0094s] [ 56%] 2024-08-08T21:29:16.1279228Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool PASSED [0.2306s] [ 56%] 2024-08-08T21:29:16.1280602Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int16 PASSED [0.0059s] [ 57%] 2024-08-08T21:29:16.1281954Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int32 PASSED [0.0055s] [ 57%] 2024-08-08T21:29:16.1283313Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int8 PASSED [0.0055s] [ 57%] 2024-08-08T21:29:16.1284707Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int8 PASSED [0.2338s] [ 57%] 2024-08-08T21:29:16.1286394Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32 SKIPPED [0.0002s] (Skipping - testing takes an unreasonably long time, #79528) [ 57%] 2024-08-08T21:29:16.1288392Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float64 SKIPPED [0.0002s] (Skipping - testing takes an unreasonably long time, #79528) [ 57%] 2024-08-08T21:29:16.1290384Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64 SKIPPED [0.0001s] (Skipping - testing takes an unreasonably long time, #79528) [ 57%] 2024-08-08T21:29:16.1292008Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int8 PASSED [0.1935s] [ 58%] 2024-08-08T21:29:16.1293279Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float32 PASSED [5.5885s] [ 58%] 2024-08-08T21:29:16.1294496Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_uint8 PASSED [2.0010s] [ 58%] 2024-08-08T21:29:16.1295672Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bfloat16 PASSED [0.0134s] [ 58%] 2024-08-08T21:29:16.1296808Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float64 PASSED [0.1923s] [ 58%] 2024-08-08T21:29:16.1297941Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int64 PASSED [0.0849s] [ 58%] 2024-08-08T21:29:16.1299130Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16 PASSED [0.0266s] [ 58%] 2024-08-08T21:29:16.1300396Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex128 PASSED [0.5312s] [ 59%] 2024-08-08T21:29:16.1301650Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex32 PASSED [0.3144s] [ 59%] 2024-08-08T21:29:16.1302884Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_float32 PASSED [0.3068s] [ 59%] 2024-08-08T21:29:16.1304097Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_int8 PASSED [0.1247s] [ 59%] 2024-08-08T21:29:16.1305296Z test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_bool PASSED [0.0280s] [ 59%] 2024-08-08T21:29:16.1306539Z test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int8 PASSED [0.0282s] [ 59%] 2024-08-08T21:29:16.1307665Z test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bfloat16 PASSED [0.0179s] [ 59%] 2024-08-08T21:29:16.1308953Z test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int64 PASSED [0.0291s] [ 60%] 2024-08-08T21:29:16.1310144Z test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float32 PASSED [2.8091s] [ 60%] 2024-08-08T21:29:16.1311268Z test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int16 PASSED [0.0525s] [ 60%] 2024-08-08T21:29:16.1312522Z test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float64 PASSED [0.1470s] [ 60%] 2024-08-08T21:29:16.1313691Z test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16 PASSED [0.0642s] [ 60%] 2024-08-08T21:29:16.1314896Z test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_int32 PASSED [0.0171s] [ 60%] 2024-08-08T21:29:16.1316398Z test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_uint8 PASSED [0.0122s] [ 60%] 2024-08-08T21:29:16.1317554Z test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int64 PASSED [0.0236s] [ 61%] 2024-08-08T21:29:16.1318754Z test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float64 PASSED [8.0951s] [ 61%] 2024-08-08T21:29:16.1319934Z test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int64 PASSED [0.4499s] [ 61%] 2024-08-08T21:29:16.1321084Z test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex64 PASSED [6.4008s] [ 61%] 2024-08-08T21:29:16.1322192Z test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int64 PASSED [6.5187s] [ 61%] 2024-08-08T21:29:16.1323455Z test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64 PASSED [0.7075s] [ 61%] 2024-08-08T21:29:16.1324698Z test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int8 PASSED [0.0100s] [ 61%] 2024-08-08T21:29:16.1325842Z test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float32 PASSED [0.2102s] [ 62%] 2024-08-08T21:29:16.1326978Z test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int32 PASSED [0.1271s] [ 62%] 2024-08-08T21:29:16.1328094Z test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int8 PASSED [0.1241s] [ 62%] 2024-08-08T21:29:16.1329189Z test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_bool PASSED [0.3644s] [ 62%] 2024-08-08T21:29:16.1330330Z test_decomp.py::TestDecompCUDA::test_comprehensive_tril_indices_cuda_int32 PASSED [2.3404s] [ 62%] 2024-08-08T21:29:16.1331503Z test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_complex128 PASSED [1.8384s] [ 62%] 2024-08-08T21:29:16.1332624Z test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int32 PASSED [0.5294s] [ 62%] 2024-08-08T21:29:16.1333781Z test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex32 PASSED [1.0266s] [ 63%] 2024-08-08T21:29:16.1334947Z test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int32 PASSED [0.1767s] [ 63%] 2024-08-08T21:29:16.1336084Z test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int64 PASSED [0.0344s] [ 63%] 2024-08-08T21:29:16.1337251Z test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_bfloat16 PASSED [0.2150s] [ 63%] 2024-08-08T21:29:16.1338445Z test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int64 PASSED [3.3622s] [ 63%] 2024-08-08T21:29:16.1339777Z test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float64 PASSED [5.0952s] [ 63%] 2024-08-08T21:29:16.1340903Z test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int64 PASSED [0.5724s] [ 63%] 2024-08-08T21:29:16.1342011Z test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int8 PASSED [0.4428s] [ 64%] 2024-08-08T21:29:16.1343284Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int64 PASSED [0.3580s] [ 64%] 2024-08-08T21:29:16.1344461Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bool PASSED [0.1243s] [ 64%] 2024-08-08T21:29:16.1345658Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_complex64 PASSED [0.4137s] [ 64%] 2024-08-08T21:29:16.1346864Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float64 PASSED [0.3283s] [ 64%] 2024-08-08T21:29:16.1348049Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int16 PASSED [0.0378s] [ 64%] 2024-08-08T21:29:16.1349195Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_uint8 PASSED [0.0376s] [ 64%] 2024-08-08T21:29:16.1350329Z test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_complex128 PASSED [1.1963s] [ 65%] 2024-08-08T21:29:16.1351462Z test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_float64 PASSED [0.7514s] [ 65%] 2024-08-08T21:29:16.1352707Z test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float64 PASSED [0.0056s] [ 65%] 2024-08-08T21:29:16.1353888Z test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float32 PASSED [0.0757s] [ 65%] 2024-08-08T21:29:16.1355022Z test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_uint8 PASSED [0.0516s] [ 65%] 2024-08-08T21:29:16.1356140Z test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int64 PASSED [0.0258s] [ 65%] 2024-08-08T21:29:16.1357489Z test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_bfloat16 SKIPPED [0.0023s] (zero_ in torch.bfloat16 not supported) [ 65%] 2024-08-08T21:29:16.1358993Z test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int16 SKIPPED [0.0022s] (zero_ in torch.int16 not supported) [ 66%] 2024-08-08T21:29:16.1360257Z test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_uint8 PASSED [0.0104s] [ 66%] 2024-08-08T21:29:16.1361422Z test_decomp.py::TestDecompCUDA::test_quick__batch_norm_with_update_cuda_float16 PASSED [0.0340s] [ 66%] 2024-08-08T21:29:16.1362645Z test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_float32 PASSED [0.4275s] [ 66%] 2024-08-08T21:29:16.1363838Z test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_bool PASSED [0.1938s] [ 66%] 2024-08-08T21:29:16.1365004Z test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int32 PASSED [0.1941s] [ 66%] 2024-08-08T21:29:16.1366167Z test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_uint8 PASSED [0.1942s] [ 67%] 2024-08-08T21:29:16.1367272Z test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int64 PASSED [0.0275s] [ 67%] 2024-08-08T21:29:16.1368281Z test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_uint8 PASSED [0.0272s] [ 67%] 2024-08-08T21:29:16.1369305Z test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bool PASSED [0.0093s] [ 67%] 2024-08-08T21:29:16.1370334Z test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_uint8 PASSED [0.0093s] [ 67%] 2024-08-08T21:29:16.1371384Z test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_bfloat16 PASSED [0.0075s] [ 67%] 2024-08-08T21:29:16.1372457Z test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_complex128 PASSED [0.0162s] [ 67%] 2024-08-08T21:29:16.1373735Z test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64 PASSED [0.1557s] [ 68%] 2024-08-08T21:29:16.1374854Z test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex64 PASSED [0.0943s] [ 68%] 2024-08-08T21:29:16.1376049Z test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_bfloat16 PASSED [0.0215s] [ 68%] 2024-08-08T21:29:16.1377153Z test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_int32 PASSED [0.0279s] [ 68%] 2024-08-08T21:29:16.1378681Z test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int16 PASSED [0.0071s] [ 68%] 2024-08-08T21:29:16.1380059Z test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int64 PASSED [0.0069s] [ 68%] 2024-08-08T21:29:16.1381686Z test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bool PASSED [0.0426s] [ 68%] 2024-08-08T21:29:16.1383356Z test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex64 PASSED [0.1150s] [ 69%] 2024-08-08T21:29:16.1385171Z test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float16 PASSED [0.0284s] [ 69%] 2024-08-08T21:29:16.1386904Z test_decomp.py::TestDecompCUDA::test_quick_any_cuda_bool PASSED [0.0435s] [ 69%] 2024-08-08T21:29:16.1388578Z test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_float64 PASSED [0.0342s] [ 69%] 2024-08-08T21:29:16.1390404Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_bool PASSED [0.0138s] [ 69%] 2024-08-08T21:29:16.1392588Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int8 PASSED [0.0136s] [ 69%] 2024-08-08T21:29:16.1394416Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bool PASSED [0.0228s] [ 69%] 2024-08-08T21:29:16.1396310Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int16 PASSED [0.0227s] [ 70%] 2024-08-08T21:29:16.1398130Z test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bfloat16 PASSED [0.0226s] [ 70%] 2024-08-08T21:29:16.1399832Z test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bool PASSED [0.1019s] [ 70%] 2024-08-08T21:29:16.1401498Z test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_int16 PASSED [0.1010s] [ 70%] 2024-08-08T21:29:16.1403134Z test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex128 PASSED [0.5829s] [ 70%] 2024-08-08T21:29:16.1405014Z test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_bool PASSED [0.0966s] [ 70%] 2024-08-08T21:29:16.1406736Z test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int64 PASSED [0.0187s] [ 70%] 2024-08-08T21:29:16.1408558Z test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int8 PASSED [0.0189s] [ 71%] 2024-08-08T21:29:16.1410421Z test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_uint8 PASSED [0.0183s] [ 71%] 2024-08-08T21:29:16.1412203Z test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_int8 PASSED [0.1155s] [ 71%] 2024-08-08T21:29:16.1413983Z test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_bfloat16 XFAIL [0.0070s] [ 71%] 2024-08-08T21:29:16.1416117Z test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_float32 PASSED [0.3554s] [ 71%] 2024-08-08T21:29:16.1417902Z test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float32 PASSED [0.1417s] [ 71%] 2024-08-08T21:29:16.1419721Z test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float64 PASSED [0.1328s] [ 71%] 2024-08-08T21:29:16.1421617Z test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int64 PASSED [0.0926s] [ 72%] 2024-08-08T21:29:16.1423459Z test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float64 PASSED [0.1403s] [ 72%] 2024-08-08T21:29:16.1426284Z test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_uint8 SKIPPED [0.0043s] (only backwards is decomposed, but dtype doesn't support AD) [ 72%] 2024-08-08T21:29:16.1428757Z test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float16 PASSED [0.0588s] [ 72%] 2024-08-08T21:29:16.1431209Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_glu_cuda_float64 SKIPPED [0.0002s] (Skipped!) [ 72%] 2024-08-08T21:29:16.1433580Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_sinc_cuda_float64 PASSED [4.2815s] [ 72%] 2024-08-08T21:29:16.1435700Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_t_copy_cuda_float64 PASSED [0.4624s] [ 72%] 2024-08-08T21:29:16.1437762Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_vdot_cuda_float64 PASSED [0.5468s] [ 73%] 2024-08-08T21:29:16.1439811Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_zero__cuda_float64 XFAIL [0.1002s] [ 73%] 2024-08-08T21:29:16.1441760Z test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float16 PASSED [0.0074s] [ 73%] 2024-08-08T21:29:16.1443548Z test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int32 PASSED [0.0097s] [ 73%] 2024-08-08T21:29:16.1445431Z test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_bfloat16 PASSED [0.0072s] [ 73%] 2024-08-08T21:29:16.1447252Z test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_uint8 PASSED [0.0093s] [ 73%] 2024-08-08T21:29:16.1449064Z test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bool PASSED [0.0448s] [ 73%] 2024-08-08T21:29:16.1450683Z test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float16 PASSED [0.0576s] [ 74%] 2024-08-08T21:29:16.1452432Z test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float32 PASSED [0.0837s] [ 74%] 2024-08-08T21:29:16.1454196Z test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_float64 PASSED [0.1119s] [ 74%] 2024-08-08T21:29:16.1455846Z test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_int8 PASSED [0.0828s] [ 74%] 2024-08-08T21:29:16.1457582Z test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_bfloat16 PASSED [0.0055s] [ 74%] 2024-08-08T21:29:16.1459598Z test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_uint8 SKIPPED [0.0023s] (diag in torch.uint8 not supported) [ 74%] 2024-08-08T21:29:16.1461711Z test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32 PASSED [0.1043s] [ 74%] 2024-08-08T21:29:16.1463532Z test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float32 PASSED [0.1634s] [ 75%] 2024-08-08T21:29:16.1465405Z test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_uint8 PASSED [0.0648s] [ 75%] 2024-08-08T21:29:16.1467174Z test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_bool PASSED [0.0094s] [ 75%] 2024-08-08T21:29:16.1468970Z test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int64 PASSED [0.0092s] [ 75%] 2024-08-08T21:29:16.1470704Z test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float16 PASSED [0.0626s] [ 75%] 2024-08-08T21:29:16.1472508Z test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_float64 PASSED [0.0099s] [ 75%] 2024-08-08T21:29:16.1474495Z test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int16 SKIPPED [0.0023s] (empty_like in torch.int16 not supported) [ 75%] 2024-08-08T21:29:16.1476607Z test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int8 SKIPPED [0.0025s] (empty_like in torch.int8 not supported) [ 76%] 2024-08-08T21:29:16.1478553Z test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_complex32 PASSED [0.2587s] [ 76%] 2024-08-08T21:29:16.1480541Z test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float16 PASSED [0.0877s] [ 76%] 2024-08-08T21:29:16.1482207Z test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int16 PASSED [0.0996s] [ 76%] 2024-08-08T21:29:16.1483849Z test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int8 PASSED [0.0098s] [ 76%] 2024-08-08T21:29:16.1485722Z test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int16 PASSED [0.0298s] [ 76%] 2024-08-08T21:29:16.1487566Z test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int8 PASSED [0.0275s] [ 76%] 2024-08-08T21:29:16.1489345Z test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_float32 PASSED [0.0146s] [ 77%] 2024-08-08T21:29:16.1491143Z test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_complex64 PASSED [0.0698s] [ 77%] 2024-08-08T21:29:16.1492965Z test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_int32 PASSED [0.0231s] [ 77%] 2024-08-08T21:29:16.1494699Z test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16 PASSED [0.0139s] [ 77%] 2024-08-08T21:29:16.1496395Z test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_complex128 PASSED [0.0639s] [ 77%] 2024-08-08T21:29:16.1498107Z test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int64 PASSED [0.0277s] [ 77%] 2024-08-08T21:29:16.1499888Z test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float32 PASSED [0.1122s] [ 77%] 2024-08-08T21:29:16.1501513Z test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int64 PASSED [0.0866s] [ 78%] 2024-08-08T21:29:16.1503184Z test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex64 PASSED [0.1027s] [ 78%] 2024-08-08T21:29:16.1505398Z test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int64 SKIPPED [0.0043s] (only backwards is decomposed, but dtype doesn't support AD) [ 78%] 2024-08-08T21:29:16.1507638Z test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_complex64 PASSED [0.1023s] [ 78%] 2024-08-08T21:29:16.1509491Z test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float16 PASSED [0.1310s] [ 78%] 2024-08-08T21:29:16.1511936Z test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_uint8 SKIPPED [0.0044s] (only backwards is decomposed, but dtype doesn't support AD) [ 78%] 2024-08-08T21:29:16.1514645Z test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_uint8 SKIPPED [0.0043s] (only backwards is decomposed, but dtype doesn't support AD) [ 78%] 2024-08-08T21:29:16.1516998Z test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_float32 PASSED [0.0301s] [ 79%] 2024-08-08T21:29:16.1519290Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int16 SKIPPED [0.0043s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.1521495Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float16 PASSED [0.0130s] [ 79%] 2024-08-08T21:29:16.1523388Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float32 PASSED [0.0297s] [ 79%] 2024-08-08T21:29:16.1525722Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int8 SKIPPED [0.0043s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.1528599Z test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int16 SKIPPED [0.0044s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.1531449Z test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_uint8 SKIPPED [0.0042s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.1533803Z test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float16 PASSED [0.0148s] [ 80%] 2024-08-08T21:29:16.1535639Z test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float32 PASSED [0.0298s] [ 80%] 2024-08-08T21:29:16.1537784Z test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float64 PASSED [0.0955s] [ 80%] 2024-08-08T21:29:16.1539532Z test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int16 PASSED [0.0088s] [ 80%] 2024-08-08T21:29:16.1541519Z test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_float16 PASSED [0.0055s] [ 80%] 2024-08-08T21:29:16.1543293Z test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int16 PASSED [0.0272s] [ 80%] 2024-08-08T21:29:16.1544998Z test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int16 PASSED [0.0950s] [ 80%] 2024-08-08T21:29:16.1546706Z test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float64 PASSED [0.0223s] [ 81%] 2024-08-08T21:29:16.1548392Z test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int16 PASSED [0.4617s] [ 81%] 2024-08-08T21:29:16.1550176Z test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float64 XFAIL [0.0062s] [ 81%] 2024-08-08T21:29:16.1552018Z test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int32 PASSED [0.0101s] [ 81%] 2024-08-08T21:29:16.1553774Z test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float16 PASSED [0.0190s] [ 81%] 2024-08-08T21:29:16.1555660Z test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex128 PASSED [0.0461s] [ 81%] 2024-08-08T21:29:16.1557412Z test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int8 PASSED [0.0200s] [ 81%] 2024-08-08T21:29:16.1559092Z test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float32 PASSED [0.0365s] [ 82%] 2024-08-08T21:29:16.1560811Z test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float64 PASSED [0.0354s] [ 82%] 2024-08-08T21:29:16.1562646Z test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float64 PASSED [0.0358s] [ 82%] 2024-08-08T21:29:16.1564472Z test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float32 PASSED [0.1444s] [ 82%] 2024-08-08T21:29:16.1566186Z test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int64 PASSED [0.1964s] [ 82%] 2024-08-08T21:29:16.1568113Z test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex64 PASSED [1.2987s] [ 82%] 2024-08-08T21:29:16.1570053Z test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float16 PASSED [0.0575s] [ 82%] 2024-08-08T21:29:16.1571753Z test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_bool PASSED [0.0103s] [ 83%] 2024-08-08T21:29:16.1573469Z test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_float16 PASSED [0.0075s] [ 83%] 2024-08-08T21:29:16.1575267Z test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_bool PASSED [0.0098s] [ 83%] 2024-08-08T21:29:16.1576977Z test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex64 PASSED [0.1739s] [ 83%] 2024-08-08T21:29:16.1578664Z test_decomp.py::TestDecompCUDA::test_quick_log_cuda_bfloat16 PASSED [0.0075s] [ 83%] 2024-08-08T21:29:16.1580462Z test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex64 PASSED [0.0177s] [ 83%] 2024-08-08T21:29:16.1582246Z test_decomp.py::TestDecompCUDA::test_quick_log_cuda_uint8 PASSED [0.0095s] [ 83%] 2024-08-08T21:29:16.1584081Z test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex128 PASSED [0.9212s] [ 84%] 2024-08-08T21:29:16.1585988Z test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int64 PASSED [0.0954s] [ 84%] 2024-08-08T21:29:16.1587897Z test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_complex64 PASSED [0.0192s] [ 84%] 2024-08-08T21:29:16.1589796Z test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int32 PASSED [0.0969s] [ 84%] 2024-08-08T21:29:16.1591993Z test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex64 PASSED [0.4923s] [ 84%] 2024-08-08T21:29:16.1593978Z test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_uint8 PASSED [0.7979s] [ 84%] 2024-08-08T21:29:16.1595961Z test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float32 PASSED [0.0877s] [ 84%] 2024-08-08T21:29:16.1597944Z test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int16 PASSED [0.0952s] [ 85%] 2024-08-08T21:29:16.1599726Z test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float16 PASSED [0.0156s] [ 85%] 2024-08-08T21:29:16.1601486Z test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int32 PASSED [0.0953s] [ 85%] 2024-08-08T21:29:16.1603154Z test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int16 PASSED [0.0944s] [ 85%] 2024-08-08T21:29:16.1604823Z test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int64 PASSED [0.0952s] [ 85%] 2024-08-08T21:29:16.1606584Z test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_bfloat16 PASSED [0.0060s] [ 85%] 2024-08-08T21:29:16.1608375Z test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16 PASSED [0.0078s] [ 85%] 2024-08-08T21:29:16.1610777Z test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_float16 SKIPPED [0.0025s] (native_batch_norm in torch.float16 not supported) [ 86%] 2024-08-08T21:29:16.1613078Z test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float64 PASSED [0.3374s] [ 86%] 2024-08-08T21:29:16.1615020Z test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bool PASSED [0.0173s] [ 86%] 2024-08-08T21:29:16.1617095Z test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float32 PASSED [0.0288s] [ 86%] 2024-08-08T21:29:16.1618877Z test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int32 PASSED [0.0172s] [ 86%] 2024-08-08T21:29:16.1620691Z test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int64 PASSED [0.0165s] [ 86%] 2024-08-08T21:29:16.1622553Z test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float32 PASSED [0.1502s] [ 86%] 2024-08-08T21:29:16.1624510Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float32 PASSED [0.0624s] [ 87%] 2024-08-08T21:29:16.1627186Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardshrink_cuda_float16 SKIPPED [0.0024s] (nn.functional.hardshrink in torch.float16 not supported) [ 87%] 2024-08-08T21:29:16.1629701Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float16 PASSED [0.0146s] [ 87%] 2024-08-08T21:29:16.1631909Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_logsigmoid_cuda_bfloat16 PASSED [0.0113s] [ 87%] 2024-08-08T21:29:16.1634005Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_bool PASSED [0.2705s] [ 87%] 2024-08-08T21:29:16.1636132Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float64 PASSED [0.3862s] [ 87%] 2024-08-08T21:29:16.1638252Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int16 PASSED [0.2678s] [ 87%] 2024-08-08T21:29:16.1640250Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int64 PASSED [0.0292s] [ 88%] 2024-08-08T21:29:16.1642715Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float64 SKIPPED [0.0023s] (nn.functional.rrelu in torch.float64 not supported) [ 88%] 2024-08-08T21:29:16.1645181Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex128 PASSED [2.4733s] [ 88%] 2024-08-08T21:29:16.1647513Z test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_complex128 SKIPPED [0.0024s] (norm in torch.complex128 not supported) [ 88%] 2024-08-08T21:29:16.1650266Z test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float32 SKIPPED [0.0023s] (norm in torch.float32 not supported) [ 88%] 2024-08-08T21:29:16.1652444Z test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float32 XFAIL [0.0065s] [ 88%] 2024-08-08T21:29:16.1654629Z test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float64 XFAIL [0.0061s] [ 88%] 2024-08-08T21:29:16.1656483Z test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int32 PASSED [0.0061s] [ 89%] 2024-08-08T21:29:16.1658227Z test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int8 PASSED [0.0058s] [ 89%] 2024-08-08T21:29:16.1659985Z test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_uint8 PASSED [0.0170s] [ 89%] 2024-08-08T21:29:16.1661801Z test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex64 PASSED [0.5623s] [ 89%] 2024-08-08T21:29:16.1663599Z test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int8 PASSED [0.0276s] [ 89%] 2024-08-08T21:29:16.1665392Z test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float64 XFAIL [0.0061s] [ 89%] 2024-08-08T21:29:16.1667235Z test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bool PASSED [0.0099s] [ 89%] 2024-08-08T21:29:16.1669162Z test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex64 PASSED [0.0177s] [ 90%] 2024-08-08T21:29:16.1671053Z test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float32 PASSED [0.0124s] [ 90%] 2024-08-08T21:29:16.1673035Z test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_bfloat16 PASSED [0.0167s] [ 90%] 2024-08-08T21:29:16.1674916Z test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float64 PASSED [0.1433s] [ 90%] 2024-08-08T21:29:16.1676775Z test_decomp.py::TestDecompCUDA::test_quick_renorm_cuda_float64 PASSED [0.0632s] [ 90%] 2024-08-08T21:29:16.1678579Z test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_int64 PASSED [0.0536s] [ 90%] 2024-08-08T21:29:16.1680279Z test_decomp.py::TestDecompCUDA::test_quick_round_cuda_int16 PASSED [0.0273s] [ 90%] 2024-08-08T21:29:16.1682144Z test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_float16 PASSED [0.0075s] [ 91%] 2024-08-08T21:29:16.1684158Z test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_bfloat16 PASSED [0.0082s] [ 91%] 2024-08-08T21:29:16.1686069Z test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_bfloat16 PASSED [0.0147s] [ 91%] 2024-08-08T21:29:16.1687859Z test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float16 PASSED [0.0129s] [ 91%] 2024-08-08T21:29:16.1689734Z test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int16 PASSED [0.0563s] [ 91%] 2024-08-08T21:29:16.1691570Z test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int16 PASSED [0.0272s] [ 91%] 2024-08-08T21:29:16.1693363Z test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_float16 PASSED [0.0120s] [ 91%] 2024-08-08T21:29:16.1695197Z test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float32 PASSED [0.0359s] [ 92%] 2024-08-08T21:29:16.1696978Z test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_uint8 PASSED [0.0269s] [ 92%] 2024-08-08T21:29:16.1698784Z test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex128 PASSED [0.0631s] [ 92%] 2024-08-08T21:29:16.1700564Z test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex32 PASSED [0.6251s] [ 92%] 2024-08-08T21:29:16.1702370Z test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float32 PASSED [0.0370s] [ 92%] 2024-08-08T21:29:16.1704118Z test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_float64 PASSED [0.0148s] [ 92%] 2024-08-08T21:29:16.1706137Z test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int32 PASSED [0.0281s] [ 92%] 2024-08-08T21:29:16.1707911Z test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_float64 PASSED [0.0998s] [ 93%] 2024-08-08T21:29:16.1709827Z test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_int64 PASSED [0.0795s] [ 93%] 2024-08-08T21:29:16.1711769Z test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_float64 PASSED [0.5255s] [ 93%] 2024-08-08T21:29:16.1713561Z test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int64 PASSED [0.4282s] [ 93%] 2024-08-08T21:29:16.1715620Z test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_uint8 PASSED [0.0100s] [ 93%] 2024-08-08T21:29:16.1717482Z test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int64 PASSED [0.0987s] [ 93%] 2024-08-08T21:29:16.1719353Z test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8 PASSED [0.0988s] [ 93%] 2024-08-08T21:29:16.1721089Z test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_uint8 PASSED [2.1959s] [ 94%] 2024-08-08T21:29:16.1722840Z test_decomp.py::TestDecompCUDA::test_quick_split_cuda_int8 PASSED [0.0438s] [ 94%] 2024-08-08T21:29:16.1724703Z test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_complex128 PASSED [0.2913s] [ 94%] 2024-08-08T21:29:16.1726729Z test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int64 PASSED [0.1230s] [ 94%] 2024-08-08T21:29:16.1728671Z test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_bfloat16 PASSED [0.0056s] [ 94%] 2024-08-08T21:29:16.1730444Z test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_uint8 PASSED [0.0278s] [ 94%] 2024-08-08T21:29:16.1732168Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float16 PASSED [0.0115s] [ 94%] 2024-08-08T21:29:16.1733879Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_uint8 PASSED [0.0351s] [ 95%] 2024-08-08T21:29:16.1735745Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int8 PASSED [0.0286s] [ 95%] 2024-08-08T21:29:16.1737497Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_uint8 PASSED [0.0285s] [ 95%] 2024-08-08T21:29:16.1739208Z test_decomp.py::TestDecompCUDA::test_quick_std_cuda_complex128 PASSED [0.2369s] [ 95%] 2024-08-08T21:29:16.1740896Z test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_float32 PASSED [0.0890s] [ 95%] 2024-08-08T21:29:16.1742643Z test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_float64 PASSED [0.0140s] [ 95%] 2024-08-08T21:29:16.1744479Z test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex128 PASSED [0.0202s] [ 95%] 2024-08-08T21:29:16.1746244Z test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float16 PASSED [0.0070s] [ 96%] 2024-08-08T21:29:16.1747809Z test_decomp.py::TestDecompCUDA::test_quick_take_cuda_bool PASSED [0.0209s] [ 96%] 2024-08-08T21:29:16.1749232Z test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bfloat16 PASSED [0.0066s] [ 96%] 2024-08-08T21:29:16.1750677Z test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int64 PASSED [0.0280s] [ 96%] 2024-08-08T21:29:16.1752304Z test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float32 PASSED [0.0095s] [ 96%] 2024-08-08T21:29:16.1753804Z test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int32 PASSED [0.0109s] [ 96%] 2024-08-08T21:29:16.1755433Z test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int64 PASSED [0.0117s] [ 96%] 2024-08-08T21:29:16.1757017Z test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int16 PASSED [0.0930s] [ 97%] 2024-08-08T21:29:16.1758867Z test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float16 PASSED [0.0058s] [ 97%] 2024-08-08T21:29:16.1760468Z test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int16 PASSED [0.0272s] [ 97%] 2024-08-08T21:29:16.1762153Z test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex32 PASSED [0.4639s] [ 97%] 2024-08-08T21:29:16.1763829Z test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int8 PASSED [0.6966s] [ 97%] 2024-08-08T21:29:16.1765470Z test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_complex32 PASSED [1.6478s] [ 97%] 2024-08-08T21:29:16.1767183Z test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int32 PASSED [0.6901s] [ 97%] 2024-08-08T21:29:16.1768983Z test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_bool PASSED [0.0476s] [ 98%] 2024-08-08T21:29:16.1770868Z test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex32 PASSED [0.1029s] [ 98%] 2024-08-08T21:29:16.1772814Z test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64 PASSED [0.0374s] [ 98%] 2024-08-08T21:29:16.1774697Z test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_complex64 PASSED [0.0313s] [ 98%] 2024-08-08T21:29:16.1776570Z test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_bool PASSED [0.0584s] [ 98%] 2024-08-08T21:29:16.1778396Z test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int64 PASSED [0.0580s] [ 98%] 2024-08-08T21:29:16.1780176Z test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int16 PASSED [0.0993s] [ 98%] 2024-08-08T21:29:16.1782277Z test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_complex64 SKIPPED [0.0024s] (zero_ in torch.complex64 not supported) [ 99%] 2024-08-08T21:29:16.1784648Z test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_uint8 SKIPPED [0.0023s] (zero_ in torch.uint8 not supported) [ 99%] 2024-08-08T21:29:16.1786764Z test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex128 PASSED [0.0067s] [ 99%] 2024-08-08T21:29:16.1788587Z test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int64 PASSED [0.0062s] [ 99%] 2024-08-08T21:29:16.1790574Z test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_GRU_train_mode_cuda_float64 PASSED [25.8615s] [ 99%] 2024-08-08T21:29:16.1792894Z test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32 PASSED [42.9641s] [ 99%] 2024-08-08T21:29:16.1795093Z test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_eval_mode_cuda_float64 PASSED [21.7714s] [100%] 2024-08-08T21:29:16.1796212Z 2024-08-08T21:29:16.1797420Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_decomp/test_decomp-dd9d012de50495e6.xml - 2024-08-08T21:29:16.1799644Z ============ 648 passed, 43 skipped, 6 xfailed in 575.74s (0:09:35) ============ 2024-08-08T21:29:16.1800930Z Got exit code -11 (SIGSEGV) 2024-08-08T21:29:16.1801492Z Retrying single test... 2024-08-08T21:29:16.1802729Z Test results will be stored in test-reports/python-pytest/test_decomp/test_decomp-2330f5b1bb70a582.xml 2024-08-08T21:29:16.1804326Z ============================= test session starts ============================== 2024-08-08T21:29:16.1805790Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T21:29:16.1806909Z cachedir: .pytest_cache 2024-08-08T21:29:16.1808363Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T21:29:16.1809716Z rootdir: /var/lib/jenkins/workspace 2024-08-08T21:29:16.1810356Z configfile: pytest.ini 2024-08-08T21:29:16.1811616Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T21:29:16.1813174Z collecting ... collected 8818 items 2024-08-08T21:29:16.1813992Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T21:29:16.1814788Z Running 697 items in this shard 2024-08-08T21:29:16.1815985Z 2024-08-08T21:29:16.1816895Z test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex32 PASSED [0.7630s] [ 0%] 2024-08-08T21:29:16.1818115Z test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_int8 PASSED [0.2469s] [ 0%] 2024-08-08T21:29:16.1819269Z test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_bool PASSED [0.6465s] [ 0%] 2024-08-08T21:29:16.1820420Z test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float32 PASSED [1.5917s] [ 0%] 2024-08-08T21:29:16.1821580Z test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float32 PASSED [4.0965s] [ 0%] 2024-08-08T21:29:16.1822756Z test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int16 PASSED [0.1417s] [ 0%] 2024-08-08T21:29:16.1823886Z test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int32 PASSED [0.1275s] [ 1%] 2024-08-08T21:29:16.1825021Z test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_uint8 PASSED [0.1469s] [ 1%] 2024-08-08T21:29:16.1826134Z test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int32 PASSED [0.2138s] [ 1%] 2024-08-08T21:29:16.1827297Z test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_float16 PASSED [0.0500s] [ 1%] 2024-08-08T21:29:16.1828464Z test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_int8 PASSED [0.4767s] [ 1%] 2024-08-08T21:29:16.1829709Z test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_bfloat16 PASSED [0.1247s] [ 1%] 2024-08-08T21:29:16.1830988Z test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_bool PASSED [1.9483s] [ 1%] 2024-08-08T21:29:16.1832411Z test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64 PASSED [2.0386s] [ 2%] 2024-08-08T21:29:16.1833686Z test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex128 PASSED [0.1711s] [ 2%] 2024-08-08T21:29:16.1834842Z test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex64 PASSED [0.1711s] [ 2%] 2024-08-08T21:29:16.1835978Z test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float32 PASSED [0.0858s] [ 2%] 2024-08-08T21:29:16.1837128Z test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex128 PASSED [0.1684s] [ 2%] 2024-08-08T21:29:16.1838260Z test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_uint8 PASSED [0.1372s] [ 2%] 2024-08-08T21:29:16.1839391Z test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_complex128 PASSED [0.8489s] [ 2%] 2024-08-08T21:29:16.1840603Z test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float16 PASSED [0.2249s] [ 3%] 2024-08-08T21:29:16.1841794Z test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_bfloat16 PASSED [0.2586s] [ 3%] 2024-08-08T21:29:16.1842964Z test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_complex64 PASSED [0.0278s] [ 3%] 2024-08-08T21:29:16.1844117Z test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int32 PASSED [0.0472s] [ 3%] 2024-08-08T21:29:16.1845231Z test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_int8 PASSED [0.0471s] [ 3%] 2024-08-08T21:29:16.1846346Z test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bool PASSED [0.0468s] [ 3%] 2024-08-08T21:29:16.1847590Z test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_uint8 PASSED [0.0472s] [ 3%] 2024-08-08T21:29:16.1848735Z test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float32 PASSED [0.2376s] [ 4%] 2024-08-08T21:29:16.1849894Z test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_float16 PASSED [0.0150s] [ 4%] 2024-08-08T21:29:16.1851126Z test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int64 PASSED [0.0108s] [ 4%] 2024-08-08T21:29:16.1852258Z test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int32 PASSED [0.0494s] [ 4%] 2024-08-08T21:29:16.1853403Z test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_uint8 PASSED [0.0255s] [ 4%] 2024-08-08T21:29:16.1854547Z test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int8 PASSED [0.0092s] [ 4%] 2024-08-08T21:29:16.1855777Z test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_int8 PASSED [0.0056s] [ 4%] 2024-08-08T21:29:16.1857286Z test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_float32 SKIPPED [0.0002s] (Expected: new_empty_strided is not comparable) [ 5%] 2024-08-08T21:29:16.1858692Z test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_int64 PASSED [0.0339s] [ 5%] 2024-08-08T21:29:16.1859824Z test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_float32 PASSED [0.1995s] [ 5%] 2024-08-08T21:29:16.1860947Z test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_bool PASSED [0.0304s] [ 5%] 2024-08-08T21:29:16.1862106Z test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_bfloat16 PASSED [0.0119s] [ 5%] 2024-08-08T21:29:16.1863299Z test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float32 PASSED [0.0235s] [ 5%] 2024-08-08T21:29:16.1864534Z test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int16 PASSED [0.0188s] [ 5%] 2024-08-08T21:29:16.1865714Z test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_bfloat16 PASSED [0.0233s] [ 6%] 2024-08-08T21:29:16.1866903Z test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bfloat16 PASSED [0.0094s] [ 6%] 2024-08-08T21:29:16.1868111Z test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex128 PASSED [0.8994s] [ 6%] 2024-08-08T21:29:16.1869288Z test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float32 PASSED [0.0352s] [ 6%] 2024-08-08T21:29:16.1870470Z test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128 PASSED [0.2235s] [ 6%] 2024-08-08T21:29:16.1871771Z test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64 PASSED [0.0174s] [ 6%] 2024-08-08T21:29:16.1872953Z test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16 PASSED [9.8980s] [ 6%] 2024-08-08T21:29:16.1874137Z test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float16 PASSED [0.0366s] [ 7%] 2024-08-08T21:29:16.1875355Z test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int64 PASSED [0.0500s] [ 7%] 2024-08-08T21:29:16.1876554Z test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_complex32 PASSED [0.0689s] [ 7%] 2024-08-08T21:29:16.1877710Z test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int64 PASSED [0.0287s] [ 7%] 2024-08-08T21:29:16.1878810Z test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int8 PASSED [0.0283s] [ 7%] 2024-08-08T21:29:16.1879952Z test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex64 PASSED [0.0334s] [ 7%] 2024-08-08T21:29:16.1881201Z test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_uint8 PASSED [0.0427s] [ 7%] 2024-08-08T21:29:16.1882346Z test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex32 PASSED [0.0318s] [ 8%] 2024-08-08T21:29:16.1883492Z test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex128 PASSED [0.0454s] [ 8%] 2024-08-08T21:29:16.1884750Z test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int32 PASSED [0.0224s] [ 8%] 2024-08-08T21:29:16.1885904Z test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex128 PASSED [0.8214s] [ 8%] 2024-08-08T21:29:16.1887067Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float64 PASSED [2.2858s] [ 8%] 2024-08-08T21:29:16.1888229Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int64 PASSED [0.7779s] [ 8%] 2024-08-08T21:29:16.1889394Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bool PASSED [0.7828s] [ 8%] 2024-08-08T21:29:16.1890558Z test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float16 PASSED [0.4574s] [ 9%] 2024-08-08T21:29:16.1891700Z test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int32 PASSED [0.0257s] [ 9%] 2024-08-08T21:29:16.1892877Z test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bfloat16 PASSED [0.0183s] [ 9%] 2024-08-08T21:29:16.1894101Z test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_complex128 PASSED [4.3563s] [ 9%] 2024-08-08T21:29:16.1895332Z test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_bfloat16 PASSED [0.0052s] [ 9%] 2024-08-08T21:29:16.1896555Z test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_bfloat16 PASSED [0.1587s] [ 9%] 2024-08-08T21:29:16.1897766Z test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int8 PASSED [0.0051s] [ 9%] 2024-08-08T21:29:16.1898926Z test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex32 PASSED [0.0258s] [ 10%] 2024-08-08T21:29:16.1900062Z test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_bfloat16 PASSED [1.6971s] [ 10%] 2024-08-08T21:29:16.1901191Z test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_int8 PASSED [3.1571s] [ 10%] 2024-08-08T21:29:16.1902318Z test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float32 PASSED [0.0447s] [ 10%] 2024-08-08T21:29:16.1903473Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_bfloat16 PASSED [1.1293s] [ 10%] 2024-08-08T21:29:16.1904639Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_int32 PASSED [0.2890s] [ 10%] 2024-08-08T21:29:16.1905866Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int32 PASSED [0.4397s] [ 10%] 2024-08-08T21:29:16.1907131Z test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8 PASSED [0.4376s] [ 11%] 2024-08-08T21:29:16.1908346Z test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_float32 PASSED [0.1053s] [ 11%] 2024-08-08T21:29:16.1909549Z test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_complex64 PASSED [1.1791s] [ 11%] 2024-08-08T21:29:16.1910769Z test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_int64 PASSED [0.1504s] [ 11%] 2024-08-08T21:29:16.1911994Z test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8 PASSED [2.2921s] [ 11%] 2024-08-08T21:29:16.1913112Z test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bool PASSED [0.0129s] [ 11%] 2024-08-08T21:29:16.1914325Z test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_bfloat16 PASSED [0.0520s] [ 11%] 2024-08-08T21:29:16.1916031Z test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_float16 PASSED [0.0510s] [ 12%] 2024-08-08T21:29:16.1917285Z test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int16 PASSED [0.1289s] [ 12%] 2024-08-08T21:29:16.1918610Z test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_float32 PASSED [0.0758s] [ 12%] 2024-08-08T21:29:16.1919760Z test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_int32 PASSED [0.0437s] [ 12%] 2024-08-08T21:29:16.1920913Z test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_bfloat16 PASSED [0.0283s] [ 12%] 2024-08-08T21:29:16.1922085Z test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex128 PASSED [0.3671s] [ 12%] 2024-08-08T21:29:16.1923414Z test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int64 SKIPPED [0.0024s] (empty in torch.int64 not supported) [ 12%] 2024-08-08T21:29:16.1924953Z test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int64 SKIPPED [0.0023s] (empty_like in torch.int64 not supported) [ 13%] 2024-08-08T21:29:16.1926558Z test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int32 SKIPPED [0.0022s] (empty_strided in torch.int32 not supported) [ 13%] 2024-08-08T21:29:16.1927908Z test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32 PASSED [0.2137s] [ 13%] 2024-08-08T21:29:16.1929004Z test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int8 PASSED [0.1307s] [ 13%] 2024-08-08T21:29:16.1930105Z test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_int8 PASSED [0.0087s] [ 13%] 2024-08-08T21:29:16.1931241Z test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_bfloat16 PASSED [0.0167s] [ 13%] 2024-08-08T21:29:16.1932383Z test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_bfloat16 PASSED [0.0219s] [ 13%] 2024-08-08T21:29:16.1933512Z test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_uint8 PASSED [0.0118s] [ 14%] 2024-08-08T21:29:16.1934625Z test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int32 PASSED [0.0126s] [ 14%] 2024-08-08T21:29:16.1935741Z test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int32 PASSED [0.0099s] [ 14%] 2024-08-08T21:29:16.1936869Z test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int64 PASSED [0.0093s] [ 14%] 2024-08-08T21:29:16.1938027Z test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bfloat16 PASSED [0.0276s] [ 14%] 2024-08-08T21:29:16.1939187Z test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex128 PASSED [0.3147s] [ 14%] 2024-08-08T21:29:16.1940347Z test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float32 PASSED [0.1364s] [ 14%] 2024-08-08T21:29:16.1941471Z test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_uint8 PASSED [0.0286s] [ 15%] 2024-08-08T21:29:16.1942646Z test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float64 PASSED [0.0087s] [ 15%] 2024-08-08T21:29:16.1943803Z test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float64 PASSED [3.4992s] [ 15%] 2024-08-08T21:29:16.1944925Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8 PASSED [0.4137s] [ 15%] 2024-08-08T21:29:16.1946117Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex64 PASSED [0.8099s] [ 15%] 2024-08-08T21:29:16.1947314Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float64 PASSED [2.4353s] [ 15%] 2024-08-08T21:29:16.1948608Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_bool PASSED [0.5164s] [ 15%] 2024-08-08T21:29:16.1949751Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_uint8 PASSED [0.6591s] [ 16%] 2024-08-08T21:29:16.1950901Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_bool PASSED [0.1533s] [ 16%] 2024-08-08T21:29:16.1952192Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_bool PASSED [0.2597s] [ 16%] 2024-08-08T21:29:16.1953367Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_complex64 PASSED [0.7606s] [ 16%] 2024-08-08T21:29:16.1954610Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int8 PASSED [0.1709s] [ 16%] 2024-08-08T21:29:16.1955794Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_float64 PASSED [0.9510s] [ 16%] 2024-08-08T21:29:16.1956958Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int64 PASSED [0.1593s] [ 16%] 2024-08-08T21:29:16.1958110Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_bool PASSED [0.2046s] [ 17%] 2024-08-08T21:29:16.1959253Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int32 PASSED [0.3099s] [ 17%] 2024-08-08T21:29:16.1960399Z test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_bool PASSED [0.2482s] [ 17%] 2024-08-08T21:29:16.1961548Z test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float16 PASSED [0.0136s] [ 17%] 2024-08-08T21:29:16.1962669Z test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_bool PASSED [0.1087s] [ 17%] 2024-08-08T21:29:16.1963779Z test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int8 PASSED [0.1080s] [ 17%] 2024-08-08T21:29:16.1964947Z test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int64 PASSED [0.0258s] [ 17%] 2024-08-08T21:29:16.1966070Z test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_uint8 PASSED [0.0361s] [ 18%] 2024-08-08T21:29:16.1967234Z test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_float64 PASSED [2.9268s] [ 18%] 2024-08-08T21:29:16.1968389Z test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_uint8 PASSED [0.0283s] [ 18%] 2024-08-08T21:29:16.1969502Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int32 PASSED [0.1262s] [ 18%] 2024-08-08T21:29:16.1970627Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bfloat16 PASSED [0.4078s] [ 18%] 2024-08-08T21:29:16.1971753Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int16 PASSED [0.1262s] [ 18%] 2024-08-08T21:29:16.1972860Z test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int8 PASSED [0.1270s] [ 18%] 2024-08-08T21:29:16.1973981Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bfloat16 PASSED [0.0104s] [ 19%] 2024-08-08T21:29:16.1975142Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bool PASSED [0.0155s] [ 19%] 2024-08-08T21:29:16.1976276Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_int8 PASSED [0.0084s] [ 19%] 2024-08-08T21:29:16.1977437Z test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_uint8 PASSED [0.0080s] [ 19%] 2024-08-08T21:29:16.1978583Z test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_bfloat16 PASSED [0.0210s] [ 19%] 2024-08-08T21:29:16.1979726Z test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_uint8 PASSED [0.0083s] [ 19%] 2024-08-08T21:29:16.1980834Z test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_int64 PASSED [0.1257s] [ 19%] 2024-08-08T21:29:16.1982014Z test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_uint8 PASSED [0.1255s] [ 20%] 2024-08-08T21:29:16.1983131Z test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int8 PASSED [0.0091s] [ 20%] 2024-08-08T21:29:16.1984288Z test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex64 PASSED [0.0706s] [ 20%] 2024-08-08T21:29:16.1985490Z test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_uint8 PASSED [0.1224s] [ 20%] 2024-08-08T21:29:16.1986603Z test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float32 PASSED [0.0625s] [ 20%] 2024-08-08T21:29:16.1987735Z test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int8 PASSED [0.0477s] [ 20%] 2024-08-08T21:29:16.1988879Z test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex32 PASSED [0.0058s] [ 20%] 2024-08-08T21:29:16.1990032Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int16 PASSED [0.0165s] [ 21%] 2024-08-08T21:29:16.1991229Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_complex128 PASSED [0.4368s] [ 21%] 2024-08-08T21:29:16.1992501Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_complex32 PASSED [0.0081s] [ 21%] 2024-08-08T21:29:16.1993678Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32 PASSED [0.0075s] [ 21%] 2024-08-08T21:29:16.1994884Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float64 PASSED [1.2118s] [ 21%] 2024-08-08T21:29:16.1996119Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int64 PASSED [0.0103s] [ 21%] 2024-08-08T21:29:16.1997342Z test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex64 PASSED [0.1405s] [ 21%] 2024-08-08T21:29:16.1998543Z test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_complex128 PASSED [0.2003s] [ 22%] 2024-08-08T21:29:16.1999721Z test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex128 PASSED [0.0803s] [ 22%] 2024-08-08T21:29:16.2000899Z test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_float64 PASSED [0.0370s] [ 22%] 2024-08-08T21:29:16.2002066Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex128 PASSED [0.2473s] [ 22%] 2024-08-08T21:29:16.2003232Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex64 PASSED [0.2453s] [ 22%] 2024-08-08T21:29:16.2004405Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int16 PASSED [0.0501s] [ 22%] 2024-08-08T21:29:16.2005526Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int32 PASSED [0.0501s] [ 22%] 2024-08-08T21:29:16.2006634Z test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_uint8 PASSED [0.0499s] [ 23%] 2024-08-08T21:29:16.2007773Z test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int8 PASSED [0.0502s] [ 23%] 2024-08-08T21:29:16.2008912Z test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int16 PASSED [0.0501s] [ 23%] 2024-08-08T21:29:16.2010057Z test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int32 PASSED [0.0499s] [ 23%] 2024-08-08T21:29:16.2011229Z test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex128 PASSED [0.0140s] [ 23%] 2024-08-08T21:29:16.2012394Z test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex64 PASSED [0.0132s] [ 23%] 2024-08-08T21:29:16.2013555Z test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float16 PASSED [0.0101s] [ 23%] 2024-08-08T21:29:16.2014930Z test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_uint8 SKIPPED [0.0023s] (item in torch.uint8 not supported) [ 24%] 2024-08-08T21:29:16.2016747Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int16 PASSED [0.0106s] [ 24%] 2024-08-08T21:29:16.2018098Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bfloat16 PASSED [0.0109s] [ 24%] 2024-08-08T21:29:16.2019671Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool PASSED [0.0107s] [ 24%] 2024-08-08T21:29:16.2021077Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_complex128 PASSED [0.0129s] [ 24%] 2024-08-08T21:29:16.2022381Z test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_int8 PASSED [0.0062s] [ 24%] 2024-08-08T21:29:16.2023569Z test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float32 PASSED [0.1430s] [ 24%] 2024-08-08T21:29:16.2024721Z test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_uint8 PASSED [0.5056s] [ 25%] 2024-08-08T21:29:16.2025834Z test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int8 PASSED [0.1234s] [ 25%] 2024-08-08T21:29:16.2027003Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_float32 PASSED [2.0108s] [ 25%] 2024-08-08T21:29:16.2028253Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64 PASSED [2.0462s] [ 25%] 2024-08-08T21:29:16.2029499Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_complex128 PASSED [1.8005s] [ 25%] 2024-08-08T21:29:16.2030702Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int64 PASSED [0.2008s] [ 25%] 2024-08-08T21:29:16.2031953Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int64 PASSED [0.0662s] [ 25%] 2024-08-08T21:29:16.2033177Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex128 PASSED [2.6641s] [ 26%] 2024-08-08T21:29:16.2034392Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_float64 PASSED [0.3660s] [ 26%] 2024-08-08T21:29:16.2035602Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float32 PASSED [0.3168s] [ 26%] 2024-08-08T21:29:16.2036821Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32 PASSED [0.0125s] [ 26%] 2024-08-08T21:29:16.2038070Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_solve_cuda_float32 PASSED [0.0179s] [ 26%] 2024-08-08T21:29:16.2039287Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex64 PASSED [9.6179s] [ 26%] 2024-08-08T21:29:16.2040501Z test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64 PASSED [3.5446s] [ 26%] 2024-08-08T21:29:16.2041659Z test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_bool PASSED [0.0104s] [ 27%] 2024-08-08T21:29:16.2042793Z test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_float64 PASSED [0.0344s] [ 27%] 2024-08-08T21:29:16.2043930Z test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int32 PASSED [0.0097s] [ 27%] 2024-08-08T21:29:16.2045056Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bfloat16 PASSED [0.0108s] [ 27%] 2024-08-08T21:29:16.2046170Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bool PASSED [0.0100s] [ 27%] 2024-08-08T21:29:16.2047304Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex64 PASSED [0.0615s] [ 27%] 2024-08-08T21:29:16.2048530Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float64 PASSED [0.7934s] [ 27%] 2024-08-08T21:29:16.2049947Z test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_uint8 PASSED [0.6810s] [ 28%] 2024-08-08T21:29:16.2051206Z test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex128 PASSED [0.9214s] [ 28%] 2024-08-08T21:29:16.2052513Z test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float16 PASSED [0.0073s] [ 28%] 2024-08-08T21:29:16.2053707Z test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_complex64 PASSED [1.0100s] [ 28%] 2024-08-08T21:29:16.2054868Z test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float64 PASSED [0.5255s] [ 28%] 2024-08-08T21:29:16.2056068Z test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex128 PASSED [0.0356s] [ 28%] 2024-08-08T21:29:16.2057279Z test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float32 PASSED [0.0216s] [ 28%] 2024-08-08T21:29:16.2058485Z test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_complex64 PASSED [0.9358s] [ 29%] 2024-08-08T21:29:16.2059690Z test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_complex64 PASSED [6.7074s] [ 29%] 2024-08-08T21:29:16.2060965Z test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float64 PASSED [29.0225s] [ 29%] 2024-08-08T21:29:16.2062214Z test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float64 PASSED [1.4797s] [ 29%] 2024-08-08T21:29:16.2063361Z test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float64 PASSED [0.0298s] [ 29%] 2024-08-08T21:29:16.2064484Z test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_uint8 PASSED [0.0217s] [ 29%] 2024-08-08T21:29:16.2065605Z test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_bfloat16 PASSED [0.0862s] [ 29%] 2024-08-08T21:29:16.2066717Z test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_bool PASSED [0.2310s] [ 30%] 2024-08-08T21:29:16.2067850Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int16 PASSED [0.6264s] [ 30%] 2024-08-08T21:29:16.2069050Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float64 PASSED [0.6292s] [ 30%] 2024-08-08T21:29:16.2070266Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8 PASSED [0.3753s] [ 30%] 2024-08-08T21:29:16.2071474Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_bfloat16 PASSED [1.5472s] [ 30%] 2024-08-08T21:29:16.2072766Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float64 PASSED [20.4809s] [ 30%] 2024-08-08T21:29:16.2073991Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_complex64 PASSED [0.7363s] [ 30%] 2024-08-08T21:29:16.2075195Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int16 PASSED [0.0937s] [ 31%] 2024-08-08T21:29:16.2076403Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float16 PASSED [0.8195s] [ 31%] 2024-08-08T21:29:16.2077617Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_int64 PASSED [0.9926s] [ 31%] 2024-08-08T21:29:16.2078814Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_float16 PASSED [0.1477s] [ 31%] 2024-08-08T21:29:16.2080026Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_bfloat16 PASSED [6.9477s] [ 31%] 2024-08-08T21:29:16.2081264Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex128 PASSED [15.9340s] [ 31%] 2024-08-08T21:29:16.2082509Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_float64 PASSED [18.8379s] [ 31%] 2024-08-08T21:29:16.2083793Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int64 PASSED [0.7177s] [ 32%] 2024-08-08T21:29:16.2084963Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_uint8 PASSED [0.8512s] [ 32%] 2024-08-08T21:29:16.2086253Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_float16 PASSED [0.0245s] [ 32%] 2024-08-08T21:29:16.2087484Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmax_cuda_float32 PASSED [3.6759s] [ 32%] 2024-08-08T21:29:16.2088690Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_bfloat16 PASSED [1.2652s] [ 32%] 2024-08-08T21:29:16.2089876Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float64 PASSED [12.0157s] [ 32%] 2024-08-08T21:29:16.2091070Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_bfloat16 PASSED [1.1955s] [ 32%] 2024-08-08T21:29:16.2092266Z test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float64 PASSED [15.2177s] [ 33%] 2024-08-08T21:29:16.2093474Z test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_complex64 PASSED [0.6943s] [ 33%] 2024-08-08T21:29:16.2094674Z test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float64 PASSED [0.3433s] [ 33%] 2024-08-08T21:29:16.2095852Z test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_int32 PASSED [0.1526s] [ 33%] 2024-08-08T21:29:16.2097071Z test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int16 PASSED [0.0053s] [ 33%] 2024-08-08T21:29:16.2098351Z test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_float64 PASSED [0.1369s] [ 33%] 2024-08-08T21:29:16.2099560Z test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16 PASSED [0.8851s] [ 34%] 2024-08-08T21:29:16.2100718Z test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_int64 PASSED [0.1568s] [ 34%] 2024-08-08T21:29:16.2102206Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex128 SKIPPED [0.0025s] (meshgrid in torch.complex128 not supported) [ 34%] 2024-08-08T21:29:16.2103933Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int64 SKIPPED [0.0023s] (meshgrid in torch.int64 not supported) [ 34%] 2024-08-08T21:29:16.2105667Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_complex128 SKIPPED [0.0023s] (meshgrid in torch.complex128 not supported) [ 34%] 2024-08-08T21:29:16.2107405Z test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int16 SKIPPED [0.0023s] (meshgrid in torch.int16 not supported) [ 34%] 2024-08-08T21:29:16.2108865Z test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bool PASSED [0.0052s] [ 34%] 2024-08-08T21:29:16.2110057Z test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_uint8 PASSED [0.1847s] [ 35%] 2024-08-08T21:29:16.2111184Z test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_int32 PASSED [0.0540s] [ 35%] 2024-08-08T21:29:16.2112391Z test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int16 PASSED [0.0209s] [ 35%] 2024-08-08T21:29:16.2113541Z test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_bfloat16 PASSED [0.0210s] [ 35%] 2024-08-08T21:29:16.2114678Z test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float32 PASSED [0.0842s] [ 35%] 2024-08-08T21:29:16.2116170Z test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float64 PASSED [0.0611s] [ 35%] 2024-08-08T21:29:16.2117302Z test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float64 PASSED [0.0912s] [ 35%] 2024-08-08T21:29:16.2118626Z test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex128 PASSED [4.4049s] [ 36%] 2024-08-08T21:29:16.2119783Z test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_float32 PASSED [2.4482s] [ 36%] 2024-08-08T21:29:16.2121064Z test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64 PASSED [0.3437s] [ 36%] 2024-08-08T21:29:16.2122253Z test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_complex32 PASSED [1.6578s] [ 36%] 2024-08-08T21:29:16.2123390Z test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int8 PASSED [0.5278s] [ 36%] 2024-08-08T21:29:16.2124581Z test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex32 PASSED [0.2134s] [ 36%] 2024-08-08T21:29:16.2125789Z test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float32 PASSED [0.0986s] [ 36%] 2024-08-08T21:29:16.2127025Z test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_bfloat16 PASSED [0.4085s] [ 37%] 2024-08-08T21:29:16.2128187Z test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_bool PASSED [0.1211s] [ 37%] 2024-08-08T21:29:16.2129540Z test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int32 SKIPPED [0.0023s] (new_empty in torch.int32 not supported) [ 37%] 2024-08-08T21:29:16.2130901Z test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex64 PASSED [0.0721s] [ 37%] 2024-08-08T21:29:16.2132199Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float16 PASSED [0.0204s] [ 37%] 2024-08-08T21:29:16.2133581Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float64 PASSED [0.0233s] [ 37%] 2024-08-08T21:29:16.2134971Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [0.0202s] [ 37%] 2024-08-08T21:29:16.2136414Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float64 PASSED [3.4779s] [ 38%] 2024-08-08T21:29:16.2137827Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int16 PASSED [0.3163s] [ 38%] 2024-08-08T21:29:16.2139126Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32 PASSED [0.2587s] [ 38%] 2024-08-08T21:29:16.2140460Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex32 PASSED [5.9913s] [ 38%] 2024-08-08T21:29:16.2141752Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_float32 PASSED [1.6389s] [ 38%] 2024-08-08T21:29:16.2143084Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_bfloat16 PASSED [0.1249s] [ 38%] 2024-08-08T21:29:16.2144443Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64 PASSED [0.4780s] [ 38%] 2024-08-08T21:29:16.2145822Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int8 PASSED [0.3285s] [ 39%] 2024-08-08T21:29:16.2147159Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32 PASSED [0.5993s] [ 39%] 2024-08-08T21:29:16.2148468Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16 PASSED [0.0572s] [ 39%] 2024-08-08T21:29:16.2149747Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float16 PASSED [0.0607s] [ 39%] 2024-08-08T21:29:16.2151144Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [2.5199s] [ 39%] 2024-08-08T21:29:16.2152884Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32 PASSED [0.0150s] [ 39%] 2024-08-08T21:29:16.2154427Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16 PASSED [0.0537s] [ 39%] 2024-08-08T21:29:16.2155829Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_bfloat16 PASSED [0.0698s] [ 40%] 2024-08-08T21:29:16.2157213Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16 PASSED [10.6026s] [ 40%] 2024-08-08T21:29:16.2158557Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64 PASSED [7.3508s] [ 40%] 2024-08-08T21:29:16.2159852Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float16 PASSED [0.1485s] [ 40%] 2024-08-08T21:29:16.2161133Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int8 PASSED [0.1741s] [ 40%] 2024-08-08T21:29:16.2162466Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float64 PASSED [2.4461s] [ 40%] 2024-08-08T21:29:16.2163842Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_bfloat16 PASSED [0.4417s] [ 40%] 2024-08-08T21:29:16.2165202Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32 PASSED [3.2083s] [ 41%] 2024-08-08T21:29:16.2166614Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32 PASSED [2.7243s] [ 41%] 2024-08-08T21:29:16.2168032Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64 PASSED [2.2307s] [ 41%] 2024-08-08T21:29:16.2169378Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float32 PASSED [19.5139s] [ 41%] 2024-08-08T21:29:16.2170706Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32 PASSED [5.2209s] [ 41%] 2024-08-08T21:29:16.2172047Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float32 PASSED [0.4254s] [ 41%] 2024-08-08T21:29:16.2173349Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float16 PASSED [0.0693s] [ 41%] 2024-08-08T21:29:16.2174711Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32 PASSED [0.8404s] [ 42%] 2024-08-08T21:29:16.2176131Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_bfloat16 PASSED [0.3574s] [ 42%] 2024-08-08T21:29:16.2177544Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16 PASSED [0.2766s] [ 42%] 2024-08-08T21:29:16.2178931Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_complex64 PASSED [5.0177s] [ 42%] 2024-08-08T21:29:16.2180253Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float64 PASSED [3.4649s] [ 42%] 2024-08-08T21:29:16.2181813Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_bfloat16 SKIPPED [0.0002s] (Expected: new_empty_strided is not comparable) [ 42%] 2024-08-08T21:29:16.2183359Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_uint8 PASSED [0.1594s] [ 42%] 2024-08-08T21:29:16.2184731Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_complex128 PASSED [0.1919s] [ 43%] 2024-08-08T21:29:16.2186403Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int64 SKIPPED [0.0024s] (nn.functional.relu6 in torch.int64 not supported) [ 43%] 2024-08-08T21:29:16.2187896Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_bfloat16 PASSED [0.0834s] [ 43%] 2024-08-08T21:29:16.2189220Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_cuda_float64 PASSED [0.1894s] [ 43%] 2024-08-08T21:29:16.2190526Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16 PASSED [0.2544s] [ 43%] 2024-08-08T21:29:16.2191923Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float64 PASSED [0.8218s] [ 43%] 2024-08-08T21:29:16.2193534Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float16 SKIPPED [0.0023s] (nn.functional.softshrink in torch.float16 not supported) [ 43%] 2024-08-08T21:29:16.2195127Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_complex64 PASSED [0.3318s] [ 44%] 2024-08-08T21:29:16.2196418Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8 PASSED [0.0179s] [ 44%] 2024-08-08T21:29:16.2197714Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float32 PASSED [0.1379s] [ 44%] 2024-08-08T21:29:16.2199103Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int8 PASSED [1.4618s] [ 44%] 2024-08-08T21:29:16.2200529Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float32 PASSED [2.0845s] [ 44%] 2024-08-08T21:29:16.2201892Z test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float16 PASSED [0.2137s] [ 44%] 2024-08-08T21:29:16.2203289Z test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 44%] 2024-08-08T21:29:16.2204683Z test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float64 SKIPPED [0.0009s] (Only runs on cpu) [ 45%] 2024-08-08T21:29:16.2206094Z test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_int16 SKIPPED [0.0009s] (Only runs on cpu) [ 45%] 2024-08-08T21:29:16.2207576Z test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_complex64 SKIPPED [0.0022s] (norm in torch.complex64 not supported) [ 45%] 2024-08-08T21:29:16.2209125Z test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_float16 SKIPPED [0.0023s] (norm in torch.float16 not supported) [ 45%] 2024-08-08T21:29:16.2210636Z test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float64 SKIPPED [0.0023s] (norm in torch.float64 not supported) [ 45%] 2024-08-08T21:29:16.2211975Z test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex128 PASSED [0.0212s] [ 45%] 2024-08-08T21:29:16.2213129Z test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex32 PASSED [0.0208s] [ 45%] 2024-08-08T21:29:16.2214282Z test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_float32 PASSED [0.0609s] [ 46%] 2024-08-08T21:29:16.2215837Z test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_uint8 PASSED [0.0109s] [ 46%] 2024-08-08T21:29:16.2217371Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64 PASSED [0.0738s] [ 46%] 2024-08-08T21:29:16.2218659Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int16 PASSED [0.0120s] [ 46%] 2024-08-08T21:29:16.2219934Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool PASSED [0.0086s] [ 46%] 2024-08-08T21:29:16.2221357Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_uint8 PASSED [0.0089s] [ 46%] 2024-08-08T21:29:16.2222643Z test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16 PASSED [0.0219s] [ 46%] 2024-08-08T21:29:16.2223990Z test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float64 PASSED [0.0049s] [ 47%] 2024-08-08T21:29:16.2225160Z test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_complex128 PASSED [0.3181s] [ 47%] 2024-08-08T21:29:16.2226318Z test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bfloat16 PASSED [0.0072s] [ 47%] 2024-08-08T21:29:16.2227484Z test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float64 PASSED [0.0118s] [ 47%] 2024-08-08T21:29:16.2228640Z test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int16 PASSED [0.0185s] [ 47%] 2024-08-08T21:29:16.2229767Z test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int32 PASSED [0.0182s] [ 47%] 2024-08-08T21:29:16.2230892Z test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int64 PASSED [0.0178s] [ 47%] 2024-08-08T21:29:16.2232125Z test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_bfloat16 PASSED [0.0059s] [ 48%] 2024-08-08T21:29:16.2233253Z test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_bool PASSED [0.0398s] [ 48%] 2024-08-08T21:29:16.2234378Z test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex64 PASSED [0.0853s] [ 48%] 2024-08-08T21:29:16.2235507Z test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int64 PASSED [0.0054s] [ 48%] 2024-08-08T21:29:16.2236634Z test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16 PASSED [0.1270s] [ 48%] 2024-08-08T21:29:16.2237780Z test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_uint8 PASSED [0.1262s] [ 48%] 2024-08-08T21:29:16.2238915Z test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int64 PASSED [0.1549s] [ 48%] 2024-08-08T21:29:16.2240039Z test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_uint8 PASSED [0.1547s] [ 49%] 2024-08-08T21:29:16.2241242Z test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex32 PASSED [0.1369s] [ 49%] 2024-08-08T21:29:16.2242445Z test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int32 PASSED [0.0278s] [ 49%] 2024-08-08T21:29:16.2243598Z test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int32 PASSED [0.0594s] [ 49%] 2024-08-08T21:29:16.2244772Z test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_float64 PASSED [0.0057s] [ 49%] 2024-08-08T21:29:16.2245970Z test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_int16 PASSED [0.0054s] [ 49%] 2024-08-08T21:29:16.2247130Z test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float32 PASSED [3.1594s] [ 49%] 2024-08-08T21:29:16.2248257Z test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int16 PASSED [1.1091s] [ 50%] 2024-08-08T21:29:16.2249388Z test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float32 PASSED [0.0714s] [ 50%] 2024-08-08T21:29:16.2250502Z test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_float16 PASSED [0.0156s] [ 50%] 2024-08-08T21:29:16.2251631Z test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float16 PASSED [0.0520s] [ 50%] 2024-08-08T21:29:16.2252808Z test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float32 PASSED [0.0058s] [ 50%] 2024-08-08T21:29:16.2254132Z test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int16 PASSED [0.0190s] [ 50%] 2024-08-08T21:29:16.2255369Z test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int64 PASSED [0.0186s] [ 50%] 2024-08-08T21:29:16.2256689Z test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_float64 PASSED [2.5674s] [ 51%] 2024-08-08T21:29:16.2257878Z test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int32 PASSED [0.0066s] [ 51%] 2024-08-08T21:29:16.2258994Z test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int16 PASSED [0.0526s] [ 51%] 2024-08-08T21:29:16.2260101Z test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float16 PASSED [0.0067s] [ 51%] 2024-08-08T21:29:16.2261342Z test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_exponential_cuda_float32 PASSED [0.2624s] [ 51%] 2024-08-08T21:29:16.2262693Z test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64 PASSED [0.4592s] [ 51%] 2024-08-08T21:29:16.2263917Z test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bool PASSED [0.0295s] [ 51%] 2024-08-08T21:29:16.2265062Z test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int64 PASSED [0.0292s] [ 52%] 2024-08-08T21:29:16.2266207Z test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex128 PASSED [0.3214s] [ 52%] 2024-08-08T21:29:16.2267337Z test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float16 PASSED [0.0095s] [ 52%] 2024-08-08T21:29:16.2268454Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float32 PASSED [0.1936s] [ 52%] 2024-08-08T21:29:16.2269565Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int8 PASSED [0.0787s] [ 52%] 2024-08-08T21:29:16.2270684Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float16 PASSED [0.0107s] [ 52%] 2024-08-08T21:29:16.2271873Z test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64 PASSED [0.1511s] [ 52%] 2024-08-08T21:29:16.2273012Z test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bfloat16 PASSED [0.3187s] [ 53%] 2024-08-08T21:29:16.2274175Z test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex128 PASSED [2.8169s] [ 53%] 2024-08-08T21:29:16.2275356Z test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_int32 PASSED [3.0001s] [ 53%] 2024-08-08T21:29:16.2276594Z test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex64 PASSED [0.2186s] [ 53%] 2024-08-08T21:29:16.2277840Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_bool PASSED [0.0075s] [ 53%] 2024-08-08T21:29:16.2279071Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float32 PASSED [0.0067s] [ 53%] 2024-08-08T21:29:16.2280308Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int8 PASSED [0.0073s] [ 53%] 2024-08-08T21:29:16.2281599Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32 PASSED [0.0119s] [ 54%] 2024-08-08T21:29:16.2282858Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_bool PASSED [0.0957s] [ 54%] 2024-08-08T21:29:16.2284037Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_uint8 PASSED [0.0165s] [ 54%] 2024-08-08T21:29:16.2285310Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_uint8 PASSED [0.0122s] [ 54%] 2024-08-08T21:29:16.2286626Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_uint8 PASSED [0.0124s] [ 54%] 2024-08-08T21:29:16.2287980Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_bool PASSED [0.0104s] [ 54%] 2024-08-08T21:29:16.2289151Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float64 PASSED [0.1363s] [ 54%] 2024-08-08T21:29:16.2290405Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int16 PASSED [0.0091s] [ 55%] 2024-08-08T21:29:16.2291573Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float64 PASSED [0.1705s] [ 55%] 2024-08-08T21:29:16.2292849Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32 PASSED [0.0106s] [ 55%] 2024-08-08T21:29:16.2294203Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8 PASSED [0.0122s] [ 55%] 2024-08-08T21:29:16.2295471Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int16 PASSED [0.1131s] [ 55%] 2024-08-08T21:29:16.2296678Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int8 PASSED [0.1102s] [ 55%] 2024-08-08T21:29:16.2297949Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32 PASSED [0.0067s] [ 55%] 2024-08-08T21:29:16.2299276Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int64 PASSED [0.0076s] [ 56%] 2024-08-08T21:29:16.2300585Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32 PASSED [0.0067s] [ 56%] 2024-08-08T21:29:16.2301901Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int32 PASSED [0.0079s] [ 56%] 2024-08-08T21:29:16.2303198Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int8 PASSED [0.0077s] [ 56%] 2024-08-08T21:29:16.2304439Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int32 PASSED [0.0124s] [ 56%] 2024-08-08T21:29:16.2305743Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bool PASSED [0.0094s] [ 56%] 2024-08-08T21:29:16.2307140Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool PASSED [0.0073s] [ 56%] 2024-08-08T21:29:16.2308509Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int16 PASSED [0.0055s] [ 57%] 2024-08-08T21:29:16.2309873Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int32 PASSED [0.0054s] [ 57%] 2024-08-08T21:29:16.2311226Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int8 PASSED [0.0054s] [ 57%] 2024-08-08T21:29:16.2312691Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int8 PASSED [0.0074s] [ 57%] 2024-08-08T21:29:16.2314392Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32 SKIPPED [0.0002s] (Skipping - testing takes an unreasonably long time, #79528) [ 57%] 2024-08-08T21:29:16.2316795Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float64 SKIPPED [0.0001s] (Skipping - testing takes an unreasonably long time, #79528) [ 57%] 2024-08-08T21:29:16.2318795Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64 SKIPPED [0.0001s] (Skipping - testing takes an unreasonably long time, #79528) [ 57%] 2024-08-08T21:29:16.2320436Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int8 PASSED [0.0119s] [ 58%] 2024-08-08T21:29:16.2321865Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_float32 PASSED [6.2023s] [ 58%] 2024-08-08T21:29:16.2323089Z test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_uint8 PASSED [2.8436s] [ 58%] 2024-08-08T21:29:16.2324379Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bfloat16 PASSED [0.0114s] [ 58%] 2024-08-08T21:29:16.2325542Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float64 PASSED [0.1775s] [ 58%] 2024-08-08T21:29:16.2326680Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int64 PASSED [0.0867s] [ 58%] 2024-08-08T21:29:16.2327893Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16 PASSED [0.0239s] [ 58%] 2024-08-08T21:29:16.2329161Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex128 PASSED [0.9292s] [ 59%] 2024-08-08T21:29:16.2330445Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex32 PASSED [0.3014s] [ 59%] 2024-08-08T21:29:16.2331697Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_float32 PASSED [0.3164s] [ 59%] 2024-08-08T21:29:16.2332934Z test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_int8 PASSED [0.1306s] [ 59%] 2024-08-08T21:29:16.2334106Z test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_bool PASSED [0.0298s] [ 59%] 2024-08-08T21:29:16.2335213Z test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int8 PASSED [0.0299s] [ 59%] 2024-08-08T21:29:16.2336354Z test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bfloat16 PASSED [0.0201s] [ 59%] 2024-08-08T21:29:16.2337555Z test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int64 PASSED [0.0295s] [ 60%] 2024-08-08T21:29:16.2338764Z test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float32 PASSED [3.0098s] [ 60%] 2024-08-08T21:29:16.2339905Z test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int16 PASSED [0.0538s] [ 60%] 2024-08-08T21:29:16.2341071Z test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float64 PASSED [0.1362s] [ 60%] 2024-08-08T21:29:16.2342239Z test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16 PASSED [0.0645s] [ 60%] 2024-08-08T21:29:16.2343392Z test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_int32 PASSED [0.0172s] [ 60%] 2024-08-08T21:29:16.2344516Z test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_uint8 PASSED [0.0124s] [ 60%] 2024-08-08T21:29:16.2345672Z test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_int64 PASSED [0.0233s] [ 61%] 2024-08-08T21:29:16.2346900Z test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float64 PASSED [7.1702s] [ 61%] 2024-08-08T21:29:16.2348100Z test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int64 PASSED [0.3306s] [ 61%] 2024-08-08T21:29:16.2349263Z test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex64 PASSED [6.2175s] [ 61%] 2024-08-08T21:29:16.2350382Z test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int64 PASSED [4.5563s] [ 61%] 2024-08-08T21:29:16.2351754Z test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64 PASSED [0.9491s] [ 61%] 2024-08-08T21:29:16.2353019Z test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int8 PASSED [0.0099s] [ 61%] 2024-08-08T21:29:16.2354184Z test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float32 PASSED [0.5027s] [ 62%] 2024-08-08T21:29:16.2355421Z test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int32 PASSED [0.2275s] [ 62%] 2024-08-08T21:29:16.2356546Z test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int8 PASSED [0.1180s] [ 62%] 2024-08-08T21:29:16.2357661Z test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_bool PASSED [0.4994s] [ 62%] 2024-08-08T21:29:16.2358890Z test_decomp.py::TestDecompCUDA::test_comprehensive_tril_indices_cuda_int32 PASSED [2.6711s] [ 62%] 2024-08-08T21:29:16.2360071Z test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_complex128 PASSED [2.6526s] [ 62%] 2024-08-08T21:29:16.2361217Z test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int32 PASSED [0.6312s] [ 62%] 2024-08-08T21:29:16.2362387Z test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex32 PASSED [0.7024s] [ 63%] 2024-08-08T21:29:16.2363552Z test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int32 PASSED [0.1737s] [ 63%] 2024-08-08T21:29:16.2364711Z test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int64 PASSED [0.0339s] [ 63%] 2024-08-08T21:29:16.2365903Z test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_bfloat16 PASSED [0.1919s] [ 63%] 2024-08-08T21:29:16.2367115Z test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int64 PASSED [4.6773s] [ 63%] 2024-08-08T21:29:16.2368279Z test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_float64 PASSED [5.1150s] [ 63%] 2024-08-08T21:29:16.2369425Z test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int64 PASSED [0.3688s] [ 63%] 2024-08-08T21:29:16.2370560Z test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int8 PASSED [0.3670s] [ 64%] 2024-08-08T21:29:16.2371715Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int64 PASSED [0.3650s] [ 64%] 2024-08-08T21:29:16.2372906Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bool PASSED [0.1270s] [ 64%] 2024-08-08T21:29:16.2374112Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_complex64 PASSED [0.3996s] [ 64%] 2024-08-08T21:29:16.2375354Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float64 PASSED [0.2961s] [ 64%] 2024-08-08T21:29:16.2376535Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int16 PASSED [0.0386s] [ 64%] 2024-08-08T21:29:16.2377685Z test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_uint8 PASSED [0.0382s] [ 64%] 2024-08-08T21:29:16.2378835Z test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_complex128 PASSED [1.1680s] [ 65%] 2024-08-08T21:29:16.2379971Z test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_float64 PASSED [0.7151s] [ 65%] 2024-08-08T21:29:16.2381157Z test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float64 PASSED [0.0054s] [ 65%] 2024-08-08T21:29:16.2382344Z test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float32 PASSED [0.0719s] [ 65%] 2024-08-08T21:29:16.2383495Z test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_uint8 PASSED [0.0518s] [ 65%] 2024-08-08T21:29:16.2384666Z test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int64 PASSED [0.0266s] [ 65%] 2024-08-08T21:29:16.2386025Z test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_bfloat16 SKIPPED [0.0024s] (zero_ in torch.bfloat16 not supported) [ 65%] 2024-08-08T21:29:16.2387537Z test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int16 SKIPPED [0.0023s] (zero_ in torch.int16 not supported) [ 66%] 2024-08-08T21:29:16.2388925Z test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_uint8 PASSED [0.0107s] [ 66%] 2024-08-08T21:29:16.2390093Z test_decomp.py::TestDecompCUDA::test_quick__batch_norm_with_update_cuda_float16 PASSED [0.0372s] [ 66%] 2024-08-08T21:29:16.2391322Z test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_float32 PASSED [0.3410s] [ 66%] 2024-08-08T21:29:16.2392708Z test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_bool PASSED [0.2003s] [ 66%] 2024-08-08T21:29:16.2393891Z test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int32 PASSED [0.2002s] [ 66%] 2024-08-08T21:29:16.2395060Z test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_uint8 PASSED [0.2000s] [ 67%] 2024-08-08T21:29:16.2396155Z test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int64 PASSED [0.0281s] [ 67%] 2024-08-08T21:29:16.2397190Z test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_uint8 PASSED [0.0286s] [ 67%] 2024-08-08T21:29:16.2398216Z test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bool PASSED [0.0097s] [ 67%] 2024-08-08T21:29:16.2399255Z test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_uint8 PASSED [0.0097s] [ 67%] 2024-08-08T21:29:16.2400319Z test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_bfloat16 PASSED [0.0081s] [ 67%] 2024-08-08T21:29:16.2401404Z test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_complex128 PASSED [0.0171s] [ 67%] 2024-08-08T21:29:16.2402467Z test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64 PASSED [0.1576s] [ 68%] 2024-08-08T21:29:16.2403589Z test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex64 PASSED [0.0878s] [ 68%] 2024-08-08T21:29:16.2404724Z test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_bfloat16 PASSED [0.0234s] [ 68%] 2024-08-08T21:29:16.2405776Z test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_int32 PASSED [0.0296s] [ 68%] 2024-08-08T21:29:16.2406850Z test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int16 PASSED [0.0073s] [ 68%] 2024-08-08T21:29:16.2407945Z test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int64 PASSED [0.0075s] [ 68%] 2024-08-08T21:29:16.2409006Z test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bool PASSED [0.0428s] [ 68%] 2024-08-08T21:29:16.2410045Z test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex64 PASSED [0.1112s] [ 69%] 2024-08-08T21:29:16.2411110Z test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float16 PASSED [0.0291s] [ 69%] 2024-08-08T21:29:16.2412148Z test_decomp.py::TestDecompCUDA::test_quick_any_cuda_bool PASSED [0.0438s] [ 69%] 2024-08-08T21:29:16.2413187Z test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_float64 PASSED [0.0346s] [ 69%] 2024-08-08T21:29:16.2414354Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_bool PASSED [0.0137s] [ 69%] 2024-08-08T21:29:16.2415843Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int8 PASSED [0.0138s] [ 69%] 2024-08-08T21:29:16.2417024Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bool PASSED [0.0235s] [ 69%] 2024-08-08T21:29:16.2418166Z test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int16 PASSED [0.0233s] [ 70%] 2024-08-08T21:29:16.2419287Z test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bfloat16 PASSED [0.0225s] [ 70%] 2024-08-08T21:29:16.2420343Z test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bool PASSED [0.1019s] [ 70%] 2024-08-08T21:29:16.2421383Z test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_int16 PASSED [0.0975s] [ 70%] 2024-08-08T21:29:16.2422605Z test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex128 PASSED [0.0669s] [ 70%] 2024-08-08T21:29:16.2423697Z test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_bool PASSED [0.1090s] [ 70%] 2024-08-08T21:29:16.2424838Z test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int64 PASSED [0.0191s] [ 70%] 2024-08-08T21:29:16.2426023Z test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int8 PASSED [0.0192s] [ 71%] 2024-08-08T21:29:16.2427113Z test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_uint8 PASSED [0.0186s] [ 71%] 2024-08-08T21:29:16.2428202Z test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_int8 PASSED [0.1151s] [ 71%] 2024-08-08T21:29:16.2429281Z test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_bfloat16 XFAIL [0.0074s] [ 71%] 2024-08-08T21:29:16.2430347Z test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_float32 PASSED [0.3651s] [ 71%] 2024-08-08T21:29:16.2431444Z test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float32 PASSED [0.1382s] [ 71%] 2024-08-08T21:29:16.2432668Z test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_float64 PASSED [0.1388s] [ 71%] 2024-08-08T21:29:16.2433743Z test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int64 PASSED [0.0971s] [ 72%] 2024-08-08T21:29:16.2434817Z test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float64 PASSED [0.1470s] [ 72%] 2024-08-08T21:29:16.2436233Z test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_uint8 SKIPPED [0.0044s] (only backwards is decomposed, but dtype doesn't support AD) [ 72%] 2024-08-08T21:29:16.2437641Z test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float16 PASSED [0.0577s] [ 72%] 2024-08-08T21:29:16.2438925Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_glu_cuda_float64 SKIPPED [0.0002s] (Skipped!) [ 72%] 2024-08-08T21:29:16.2440212Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_sinc_cuda_float64 PASSED [3.9357s] [ 72%] 2024-08-08T21:29:16.2441405Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_t_copy_cuda_float64 PASSED [0.3828s] [ 72%] 2024-08-08T21:29:16.2442600Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_vdot_cuda_float64 PASSED [0.4000s] [ 73%] 2024-08-08T21:29:16.2443761Z test_decomp.py::TestDecompCUDA::test_quick_core_backward_zero__cuda_float64 XFAIL [0.0794s] [ 73%] 2024-08-08T21:29:16.2444865Z test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float16 PASSED [0.0079s] [ 73%] 2024-08-08T21:29:16.2445903Z test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int32 PASSED [0.0105s] [ 73%] 2024-08-08T21:29:16.2446938Z test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_bfloat16 PASSED [0.0073s] [ 73%] 2024-08-08T21:29:16.2447995Z test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_uint8 PASSED [0.0098s] [ 73%] 2024-08-08T21:29:16.2449062Z test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bool PASSED [0.0449s] [ 73%] 2024-08-08T21:29:16.2450186Z test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float16 PASSED [0.0489s] [ 74%] 2024-08-08T21:29:16.2451304Z test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float32 PASSED [0.0694s] [ 74%] 2024-08-08T21:29:16.2452418Z test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_float64 PASSED [0.1109s] [ 74%] 2024-08-08T21:29:16.2453486Z test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_int8 PASSED [0.0863s] [ 74%] 2024-08-08T21:29:16.2454569Z test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_bfloat16 PASSED [0.0059s] [ 74%] 2024-08-08T21:29:16.2455811Z test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_uint8 SKIPPED [0.0025s] (diag in torch.uint8 not supported) [ 74%] 2024-08-08T21:29:16.2457152Z test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32 PASSED [0.1102s] [ 74%] 2024-08-08T21:29:16.2458247Z test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float32 PASSED [0.1476s] [ 75%] 2024-08-08T21:29:16.2459393Z test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_uint8 PASSED [0.0648s] [ 75%] 2024-08-08T21:29:16.2460458Z test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_bool PASSED [0.0097s] [ 75%] 2024-08-08T21:29:16.2461516Z test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int64 PASSED [0.0095s] [ 75%] 2024-08-08T21:29:16.2462568Z test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float16 PASSED [0.0607s] [ 75%] 2024-08-08T21:29:16.2463602Z test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_float64 PASSED [0.0095s] [ 75%] 2024-08-08T21:29:16.2464893Z test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int16 SKIPPED [0.0023s] (empty_like in torch.int16 not supported) [ 75%] 2024-08-08T21:29:16.2466366Z test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int8 SKIPPED [0.0025s] (empty_like in torch.int8 not supported) [ 76%] 2024-08-08T21:29:16.2467609Z test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_complex32 PASSED [0.2443s] [ 76%] 2024-08-08T21:29:16.2468645Z test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float16 PASSED [0.0796s] [ 76%] 2024-08-08T21:29:16.2469670Z test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int16 PASSED [0.1014s] [ 76%] 2024-08-08T21:29:16.2470691Z test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int8 PASSED [0.0097s] [ 76%] 2024-08-08T21:29:16.2471802Z test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int16 PASSED [0.0306s] [ 76%] 2024-08-08T21:29:16.2472855Z test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int8 PASSED [0.0321s] [ 76%] 2024-08-08T21:29:16.2473894Z test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_float32 PASSED [0.0200s] [ 77%] 2024-08-08T21:29:16.2474986Z test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_complex64 PASSED [0.0595s] [ 77%] 2024-08-08T21:29:16.2476107Z test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_int32 PASSED [0.0232s] [ 77%] 2024-08-08T21:29:16.2477188Z test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16 PASSED [0.0157s] [ 77%] 2024-08-08T21:29:16.2478268Z test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_complex128 PASSED [0.0653s] [ 77%] 2024-08-08T21:29:16.2479327Z test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int64 PASSED [0.0280s] [ 77%] 2024-08-08T21:29:16.2480359Z test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float32 PASSED [0.1080s] [ 77%] 2024-08-08T21:29:16.2481390Z test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int64 PASSED [0.0866s] [ 78%] 2024-08-08T21:29:16.2482456Z test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex64 PASSED [0.1017s] [ 78%] 2024-08-08T21:29:16.2483885Z test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int64 SKIPPED [0.0044s] (only backwards is decomposed, but dtype doesn't support AD) [ 78%] 2024-08-08T21:29:16.2485257Z test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_complex64 PASSED [0.1199s] [ 78%] 2024-08-08T21:29:16.2486357Z test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float16 PASSED [0.1540s] [ 78%] 2024-08-08T21:29:16.2487738Z test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_uint8 SKIPPED [0.0047s] (only backwards is decomposed, but dtype doesn't support AD) [ 78%] 2024-08-08T21:29:16.2489497Z test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_uint8 SKIPPED [0.0045s] (only backwards is decomposed, but dtype doesn't support AD) [ 78%] 2024-08-08T21:29:16.2490855Z test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_float32 PASSED [0.0312s] [ 79%] 2024-08-08T21:29:16.2492331Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int16 SKIPPED [0.0045s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.2493693Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float16 PASSED [0.0138s] [ 79%] 2024-08-08T21:29:16.2494791Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_float32 PASSED [0.0314s] [ 79%] 2024-08-08T21:29:16.2496183Z test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int8 SKIPPED [0.0044s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.2497843Z test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int16 SKIPPED [0.0047s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.2499500Z test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_uint8 SKIPPED [0.0044s] (only backwards is decomposed, but dtype doesn't support AD) [ 79%] 2024-08-08T21:29:16.2500851Z test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float16 PASSED [0.0145s] [ 80%] 2024-08-08T21:29:16.2501939Z test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float32 PASSED [0.0301s] [ 80%] 2024-08-08T21:29:16.2503019Z test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_float64 PASSED [0.0314s] [ 80%] 2024-08-08T21:29:16.2504071Z test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int16 PASSED [0.0088s] [ 80%] 2024-08-08T21:29:16.2505125Z test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_float16 PASSED [0.0056s] [ 80%] 2024-08-08T21:29:16.2506177Z test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int16 PASSED [0.0289s] [ 80%] 2024-08-08T21:29:16.2507210Z test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int16 PASSED [0.1164s] [ 80%] 2024-08-08T21:29:16.2508248Z test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float64 PASSED [0.0184s] [ 81%] 2024-08-08T21:29:16.2509290Z test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int16 PASSED [0.0987s] [ 81%] 2024-08-08T21:29:16.2510353Z test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float64 XFAIL [0.0063s] [ 81%] 2024-08-08T21:29:16.2511388Z test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int32 PASSED [0.0103s] [ 81%] 2024-08-08T21:29:16.2512499Z test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float16 PASSED [0.0179s] [ 81%] 2024-08-08T21:29:16.2513617Z test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex128 PASSED [0.0351s] [ 81%] 2024-08-08T21:29:16.2514720Z test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int8 PASSED [0.0197s] [ 81%] 2024-08-08T21:29:16.2516204Z test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float32 PASSED [0.0374s] [ 82%] 2024-08-08T21:29:16.2517271Z test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float64 PASSED [0.0375s] [ 82%] 2024-08-08T21:29:16.2518351Z test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float64 PASSED [0.0368s] [ 82%] 2024-08-08T21:29:16.2519402Z test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float32 PASSED [0.1317s] [ 82%] 2024-08-08T21:29:16.2520446Z test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int64 PASSED [0.0122s] [ 82%] 2024-08-08T21:29:16.2521584Z test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex64 PASSED [1.0489s] [ 82%] 2024-08-08T21:29:16.2522723Z test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float16 PASSED [0.0568s] [ 82%] 2024-08-08T21:29:16.2523942Z test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_bool PASSED [0.0101s] [ 83%] 2024-08-08T21:29:16.2524991Z test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_float16 PASSED [0.0075s] [ 83%] 2024-08-08T21:29:16.2526175Z test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_bool PASSED [0.0099s] [ 83%] 2024-08-08T21:29:16.2527223Z test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex64 PASSED [0.0193s] [ 83%] 2024-08-08T21:29:16.2528279Z test_decomp.py::TestDecompCUDA::test_quick_log_cuda_bfloat16 PASSED [0.0074s] [ 83%] 2024-08-08T21:29:16.2529346Z test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex64 PASSED [0.0174s] [ 83%] 2024-08-08T21:29:16.2530384Z test_decomp.py::TestDecompCUDA::test_quick_log_cuda_uint8 PASSED [0.0100s] [ 83%] 2024-08-08T21:29:16.2531476Z test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex128 PASSED [0.2506s] [ 84%] 2024-08-08T21:29:16.2532595Z test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int64 PASSED [0.0970s] [ 84%] 2024-08-08T21:29:16.2533707Z test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_complex64 PASSED [0.0156s] [ 84%] 2024-08-08T21:29:16.2534817Z test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int32 PASSED [0.0984s] [ 84%] 2024-08-08T21:29:16.2535904Z test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex64 PASSED [0.4778s] [ 84%] 2024-08-08T21:29:16.2537065Z test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_uint8 PASSED [0.6614s] [ 84%] 2024-08-08T21:29:16.2538217Z test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float32 PASSED [0.0806s] [ 84%] 2024-08-08T21:29:16.2539266Z test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int16 PASSED [0.0999s] [ 85%] 2024-08-08T21:29:16.2540317Z test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float16 PASSED [0.0159s] [ 85%] 2024-08-08T21:29:16.2541381Z test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int32 PASSED [0.0971s] [ 85%] 2024-08-08T21:29:16.2542423Z test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int16 PASSED [0.0973s] [ 85%] 2024-08-08T21:29:16.2543431Z test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int64 PASSED [0.1108s] [ 85%] 2024-08-08T21:29:16.2544464Z test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_bfloat16 PASSED [0.0060s] [ 85%] 2024-08-08T21:29:16.2545543Z test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16 PASSED [0.0077s] [ 85%] 2024-08-08T21:29:16.2546928Z test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_float16 SKIPPED [0.0025s] (native_batch_norm in torch.float16 not supported) [ 86%] 2024-08-08T21:29:16.2548367Z test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float64 PASSED [0.3500s] [ 86%] 2024-08-08T21:29:16.2549499Z test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bool PASSED [0.0173s] [ 86%] 2024-08-08T21:29:16.2550575Z test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float32 PASSED [0.0291s] [ 86%] 2024-08-08T21:29:16.2551716Z test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int32 PASSED [0.0173s] [ 86%] 2024-08-08T21:29:16.2552779Z test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int64 PASSED [0.0173s] [ 86%] 2024-08-08T21:29:16.2553852Z test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float32 PASSED [0.1551s] [ 86%] 2024-08-08T21:29:16.2555013Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float32 PASSED [0.0685s] [ 87%] 2024-08-08T21:29:16.2556652Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardshrink_cuda_float16 SKIPPED [0.0024s] (nn.functional.hardshrink in torch.float16 not supported) [ 87%] 2024-08-08T21:29:16.2558147Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float16 PASSED [0.0141s] [ 87%] 2024-08-08T21:29:16.2559459Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_logsigmoid_cuda_bfloat16 PASSED [0.0136s] [ 87%] 2024-08-08T21:29:16.2560689Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_bool PASSED [0.2810s] [ 87%] 2024-08-08T21:29:16.2561930Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float64 PASSED [0.3824s] [ 87%] 2024-08-08T21:29:16.2563166Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int16 PASSED [0.2771s] [ 87%] 2024-08-08T21:29:16.2564350Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int64 PASSED [0.0302s] [ 88%] 2024-08-08T21:29:16.2565805Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float64 SKIPPED [0.0025s] (nn.functional.rrelu in torch.float64 not supported) [ 88%] 2024-08-08T21:29:16.2567251Z test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex128 PASSED [2.5269s] [ 88%] 2024-08-08T21:29:16.2568636Z test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_complex128 SKIPPED [0.0025s] (norm in torch.complex128 not supported) [ 88%] 2024-08-08T21:29:16.2570113Z test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float32 SKIPPED [0.0024s] (norm in torch.float32 not supported) [ 88%] 2024-08-08T21:29:16.2571389Z test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float32 XFAIL [0.0067s] [ 88%] 2024-08-08T21:29:16.2572516Z test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float64 XFAIL [0.0066s] [ 88%] 2024-08-08T21:29:16.2573601Z test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int32 PASSED [0.0064s] [ 89%] 2024-08-08T21:29:16.2574631Z test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int8 PASSED [0.0059s] [ 89%] 2024-08-08T21:29:16.2575673Z test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_uint8 PASSED [0.0168s] [ 89%] 2024-08-08T21:29:16.2576741Z test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex64 PASSED [0.5819s] [ 89%] 2024-08-08T21:29:16.2577799Z test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_int8 PASSED [0.0287s] [ 89%] 2024-08-08T21:29:16.2578844Z test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float64 XFAIL [0.0061s] [ 89%] 2024-08-08T21:29:16.2579898Z test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bool PASSED [0.0104s] [ 89%] 2024-08-08T21:29:16.2581012Z test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex64 PASSED [0.0175s] [ 90%] 2024-08-08T21:29:16.2582136Z test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float32 PASSED [0.0131s] [ 90%] 2024-08-08T21:29:16.2583239Z test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_bfloat16 PASSED [0.0170s] [ 90%] 2024-08-08T21:29:16.2584344Z test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float64 PASSED [0.1382s] [ 90%] 2024-08-08T21:29:16.2585424Z test_decomp.py::TestDecompCUDA::test_quick_renorm_cuda_float64 PASSED [0.0634s] [ 90%] 2024-08-08T21:29:16.2586483Z test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_int64 PASSED [0.0518s] [ 90%] 2024-08-08T21:29:16.2587532Z test_decomp.py::TestDecompCUDA::test_quick_round_cuda_int16 PASSED [0.0278s] [ 90%] 2024-08-08T21:29:16.2588631Z test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_float16 PASSED [0.0076s] [ 91%] 2024-08-08T21:29:16.2589911Z test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_bfloat16 PASSED [0.0084s] [ 91%] 2024-08-08T21:29:16.2591032Z test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_bfloat16 PASSED [0.0152s] [ 91%] 2024-08-08T21:29:16.2592268Z test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float16 PASSED [0.0115s] [ 91%] 2024-08-08T21:29:16.2593367Z test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int16 PASSED [0.0569s] [ 91%] 2024-08-08T21:29:16.2594435Z test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int16 PASSED [0.0279s] [ 91%] 2024-08-08T21:29:16.2595479Z test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_float16 PASSED [0.0118s] [ 91%] 2024-08-08T21:29:16.2596544Z test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float32 PASSED [0.0383s] [ 92%] 2024-08-08T21:29:16.2597605Z test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_uint8 PASSED [0.0294s] [ 92%] 2024-08-08T21:29:16.2598667Z test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex128 PASSED [0.0659s] [ 92%] 2024-08-08T21:29:16.2599725Z test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_complex32 PASSED [0.0706s] [ 92%] 2024-08-08T21:29:16.2600778Z test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float32 PASSED [0.0370s] [ 92%] 2024-08-08T21:29:16.2601816Z test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_float64 PASSED [0.0125s] [ 92%] 2024-08-08T21:29:16.2602848Z test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int32 PASSED [0.0283s] [ 92%] 2024-08-08T21:29:16.2603893Z test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_float64 PASSED [0.0980s] [ 93%] 2024-08-08T21:29:16.2604941Z test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_int64 PASSED [0.0807s] [ 93%] 2024-08-08T21:29:16.2606026Z test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_float64 PASSED [0.5287s] [ 93%] 2024-08-08T21:29:16.2607140Z test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int64 PASSED [0.4315s] [ 93%] 2024-08-08T21:29:16.2608254Z test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_uint8 PASSED [0.0101s] [ 93%] 2024-08-08T21:29:16.2609378Z test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int64 PASSED [0.1104s] [ 93%] 2024-08-08T21:29:16.2610500Z test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8 PASSED [0.1006s] [ 93%] 2024-08-08T21:29:16.2611608Z test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_uint8 PASSED [0.1097s] [ 94%] 2024-08-08T21:29:16.2612671Z test_decomp.py::TestDecompCUDA::test_quick_split_cuda_int8 PASSED [0.0448s] [ 94%] 2024-08-08T21:29:16.2613810Z test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_complex128 PASSED [0.2873s] [ 94%] 2024-08-08T21:29:16.2615059Z test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int64 PASSED [0.1242s] [ 94%] 2024-08-08T21:29:16.2616482Z test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_bfloat16 PASSED [0.0057s] [ 94%] 2024-08-08T21:29:16.2617537Z test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_uint8 PASSED [0.0286s] [ 94%] 2024-08-08T21:29:16.2618588Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float16 PASSED [0.0118s] [ 94%] 2024-08-08T21:29:16.2619658Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_uint8 PASSED [0.0353s] [ 95%] 2024-08-08T21:29:16.2620763Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int8 PASSED [0.0285s] [ 95%] 2024-08-08T21:29:16.2621901Z test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_uint8 PASSED [0.0291s] [ 95%] 2024-08-08T21:29:16.2623142Z test_decomp.py::TestDecompCUDA::test_quick_std_cuda_complex128 PASSED [0.2229s] [ 95%] 2024-08-08T21:29:16.2624198Z test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_float32 PASSED [0.0780s] [ 95%] 2024-08-08T21:29:16.2625249Z test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_float64 PASSED [0.0130s] [ 95%] 2024-08-08T21:29:16.2626415Z test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex128 PASSED [0.0172s] [ 95%] 2024-08-08T21:29:16.2627456Z test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float16 PASSED [0.0072s] [ 96%] 2024-08-08T21:29:16.2628481Z test_decomp.py::TestDecompCUDA::test_quick_take_cuda_bool PASSED [0.0208s] [ 96%] 2024-08-08T21:29:16.2629526Z test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bfloat16 PASSED [0.0073s] [ 96%] 2024-08-08T21:29:16.2630566Z test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int64 PASSED [0.0286s] [ 96%] 2024-08-08T21:29:16.2631705Z test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float32 PASSED [0.0095s] [ 96%] 2024-08-08T21:29:16.2632790Z test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int32 PASSED [0.0108s] [ 96%] 2024-08-08T21:29:16.2633892Z test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int64 PASSED [0.0109s] [ 96%] 2024-08-08T21:29:16.2634959Z test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int16 PASSED [0.0922s] [ 97%] 2024-08-08T21:29:16.2636003Z test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float16 PASSED [0.0059s] [ 97%] 2024-08-08T21:29:16.2637050Z test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int16 PASSED [0.0276s] [ 97%] 2024-08-08T21:29:16.2638105Z test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex32 PASSED [0.2607s] [ 97%] 2024-08-08T21:29:16.2639200Z test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int8 PASSED [0.7072s] [ 97%] 2024-08-08T21:29:16.2640296Z test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_complex32 PASSED [1.6527s] [ 97%] 2024-08-08T21:29:16.2641354Z test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int32 PASSED [0.7046s] [ 97%] 2024-08-08T21:29:16.2642452Z test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_bool PASSED [0.0367s] [ 98%] 2024-08-08T21:29:16.2643591Z test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex32 PASSED [0.0896s] [ 98%] 2024-08-08T21:29:16.2644788Z test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64 PASSED [0.0369s] [ 98%] 2024-08-08T21:29:16.2645879Z test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_complex64 PASSED [0.0216s] [ 98%] 2024-08-08T21:29:16.2646959Z test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_bool PASSED [0.0583s] [ 98%] 2024-08-08T21:29:16.2648039Z test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int64 PASSED [0.0577s] [ 98%] 2024-08-08T21:29:16.2649097Z test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int16 PASSED [0.0994s] [ 98%] 2024-08-08T21:29:16.2650376Z test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_complex64 SKIPPED [0.0023s] (zero_ in torch.complex64 not supported) [ 99%] 2024-08-08T21:29:16.2651817Z test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_uint8 SKIPPED [0.0023s] (zero_ in torch.uint8 not supported) [ 99%] 2024-08-08T21:29:16.2653053Z test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex128 PASSED [0.0068s] [ 99%] 2024-08-08T21:29:16.2654116Z test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int64 PASSED [0.0060s] [ 99%] 2024-08-08T21:29:16.2655289Z test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_GRU_train_mode_cuda_float64 PASSED [19.5910s] [ 99%] 2024-08-08T21:29:16.2656691Z test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32 PASSED [37.0864s] [ 99%] 2024-08-08T21:29:16.2657966Z test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_eval_mode_cuda_float64 PASSED [17.9140s] [100%] 2024-08-08T21:29:16.2658623Z 2024-08-08T21:29:16.2659429Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_decomp/test_decomp-2330f5b1bb70a582.xml - 2024-08-08T21:29:16.2660776Z ============ 648 passed, 43 skipped, 6 xfailed in 513.42s (0:08:33) ============ 2024-08-08T21:29:16.2661504Z Got exit code 0 2024-08-08T21:29:16.2661980Z Test succeeeded in new process, continuing with the rest of the tests 2024-08-08T21:29:16.2662956Z Test results will be stored in test-reports/python-pytest/test_decomp/test_decomp-291178fec05591b3.xml 2024-08-08T21:29:16.2663877Z ============================= test session starts ============================== 2024-08-08T21:29:16.2664789Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T21:29:16.2665482Z cachedir: .pytest_cache 2024-08-08T21:29:16.2666351Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T21:29:16.2667194Z rootdir: /var/lib/jenkins/workspace 2024-08-08T21:29:16.2667590Z configfile: pytest.ini 2024-08-08T21:29:16.2668358Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T21:29:16.2669332Z collecting ... collected 8818 items / 697 deselected / 8121 selected 2024-08-08T21:29:16.2669956Z stepcurrent: skipping 697 already run items. 2024-08-08T21:29:16.2670403Z Running 0 items in this shard 2024-08-08T21:29:16.2670658Z 2024-08-08T21:29:16.2671369Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/test_decomp/test_decomp-291178fec05591b3.xml - 2024-08-08T21:29:16.2672646Z =========================== 697 deselected in 0.68s ============================ 2024-08-08T21:29:16.2673535Z The following tests failed and then succeeded when run in a new process['ul'] 2024-08-08T21:29:16.2674019Z 2024-08-08T21:29:16.2674566Z FINISHED PRINTING LOG FILE of test_decomp 6/13 (test/test-reports/test_decomp_6.13_e203bdf501237379_.log) 2024-08-08T21:29:16.2675175Z 2024-08-08T21:29:19.2229247Z Running inductor/test_torchinductor_opinfo 9/12 ... [2024-08-08 21:29:19.222266] 2024-08-08T21:29:19.2232049Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=9', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:29:19.222652] 2024-08-08T21:31:59.4668503Z 2024-08-08T21:31:59.4670795Z inductor/test_torchinductor_opinfo 1/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_1.12_17d5a0bf28fd4858_.log 2024-08-08T21:31:59.4826435Z Running 285 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmod___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_lengths_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addbmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_complex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hypot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igammac_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lcm_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_singular_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_svd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorinv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_dropout_backward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_zeros_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_elu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_prelu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_rms_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softplus_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensordot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_uniform_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_uint32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_uint8 2024-08-08T21:31:59.4978948Z 2024-08-08T21:32:02.7168865Z Running functorch/test_ops 4/6 ... [2024-08-08 21:32:02.716253] 2024-08-08T21:32:02.7171209Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ops.py', '-m', 'not serial', '--shard-id=4', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:32:02.716660] 2024-08-08T21:33:30.7528345Z 2024-08-08T21:33:30.7530169Z inductor/test_torchinductor_opinfo 8/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.12_44646b0001d84854_.log 2024-08-08T21:33:30.7676016Z Running 267 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_not_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_shapes_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_complex_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lstsq_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorsolve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_maximum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softplus_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_uniform_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_int32 2024-08-08T21:33:30.7813009Z 2024-08-08T21:33:34.0275951Z Running functorch/test_aotdispatch 1/1 ... [2024-08-08 21:33:34.026914] 2024-08-08T21:33:34.0278350Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:33:34.027375] 2024-08-08T21:34:30.3577794Z 2024-08-08T21:34:30.3579654Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_d399b37cd9957e26_.log 2024-08-08T21:34:30.3776787Z Running 433 items in this shard: test/functorch/test_aotdispatch.py::TestAOTAutograd::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_fsdp_set__into_same_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_module, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_ban_dropout_mut_pre_dispatch, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_multiple_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_no_buffer_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_functionalized_rng_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_dupes_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_input_requiring_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_parameter_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_metadata_mutation_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_module_joint, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_multiple_outputs_require_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_buffer_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_inplace, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_linear, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_contiguous, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_conv_and_bn, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_composite_implicit, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_simple, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_view, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_1, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_2, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_outdtype, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_reshape, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_autograd_op, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond_nested, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_basic, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_pytrees_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_synthetic_bases_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_unbacked_arg, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_with_torch_cond, test/functorch/test_aotdispatch.py::TestPartitioning::test_autocast, test/functorch/test_aotdispatch.py::TestPartitioning::test_contiguous, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_getitem, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_generate_gives_inference_graph, test/functorch/test_aotdispatch.py::TestPartitioning::test_meta_tensor_inplace_op, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_save_shape, test/functorch/test_aotdispatch.py::TestPartitioning::test_preserve_random, test/functorch/test_aotdispatch.py::TestPartitioning::test_recompute_partitioning, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_incorrect_backward, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_inference, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad_views, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_simple, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_dynamic, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_fake_tensor_gm_raises, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace_from_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inference_python_dispatcher, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_lift_fresh_copy_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_fsdp_set__into_same_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_fsdp_set__into_same_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_detach 2024-08-08T21:34:30.3969213Z 2024-08-08T21:34:33.6227877Z Running inductor/test_cudacodecache 1/1 ... [2024-08-08 21:34:33.622134] 2024-08-08T21:34:33.6230000Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudacodecache.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:34:33.622542] 2024-08-08T21:34:40.4505207Z 2024-08-08T21:34:40.4506967Z inductor/test_cudacodecache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudacodecache_1.1_dbc14e90da05020b_.log 2024-08-08T21:34:40.4511253Z Running 3 items in this shard: test/inductor/test_cudacodecache.py::TestCUDACodeCache::test_async_compile, test/inductor/test_cudacodecache.py::TestCUDACodeCache::test_compilation_error, test/inductor/test_cudacodecache.py::TestCUDACodeCache::test_cuda_load 2024-08-08T21:34:40.4513265Z 2024-08-08T21:34:43.7132044Z Running export/test_retraceability 1/1 ... [2024-08-08 21:34:43.712526] 2024-08-08T21:34:43.7134106Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_retraceability.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:34:43.712913] 2024-08-08T21:35:28.6653723Z 2024-08-08T21:35:28.6655180Z export/test_retraceability 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_retraceability_1.1_b8c2771bf028c6ff_.log 2024-08-08T21:35:28.6752910Z Running 179 items in this shard: test/export/test_retraceability.py::RetraceExportTestDynamismExpression::test_export_assume_static_by_default_retraceability, test/export/test_retraceability.py::RetraceExportTestDynamismExpression::test_export_constraints_error_not_in_range_retraceability, test/export/test_retraceability.py::RetraceExportTestDynamismExpression::test_export_constraints_error_retraceability, test/export/test_retraceability.py::RetraceExportTestDynamismExpression::test_export_inline_constraints_retraceability, test/export/test_retraceability.py::RetraceExportTestDynamismExpression::test_export_slice_maxsize_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test__scaled_dot_product_flash_attention_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_allow_explicit_guards_as_runtime_asserts_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_args_type_checked_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_aten_lift_fresh_copy_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_automatic_constrain_size_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_basic_non_strict_fake_tensor_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_basic_non_strict_real_tensor_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_basic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_buffer_util_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_check_is_size_error_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_check_specialized_int_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_checks_to_constrain_range_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_colon_parameter_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_compiling_state_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_cond_buffers_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_cond_with_module_stack_export_with_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_cond_with_module_stack_export_with_unflatten_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constant_aliasing_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constant_input_naming_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constant_output_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constrain_decomp_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constrain_size_in_eager_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constrain_size_with_constrain_value_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_constrain_size_with_various_cases_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_conv_dynamic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_crop_like_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_cse_for_symint_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_custom_op_auto_functionalize_pre_dispatch_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_custom_op_auto_functionalize_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_custom_op_auto_warn_pre_dispatch_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_decomp_batch_norm_functional_predispatch_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_1_2_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_basic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_integer_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_nested_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_out_of_order_repeat_derived_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_out_of_order_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_out_of_order_simplified_repeat_non_derived_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_out_of_order_simplified_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_derived_dim_repeat_derived_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_device_to_dynamic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_device_to_mutation_float_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_device_to_mutation_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_device_to_static_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_dim_1_2_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_disable_forced_specializations_errors_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_disable_forced_specializations_ok_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_dynamic_shapes_builder_basic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_dynamic_shapes_builder_kwargs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_dynamic_shapes_builder_pytree_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_dynamic_shapes_spec_with_pytree_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_error_does_not_reference_eager_fallback_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_api_with_dynamic_shapes_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_as_backend_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_cond_preserve_torch_fn_for_subgraphs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_cond_symbool_pred_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_decomps_dynamic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_decomps_simple_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_dynamo_config_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_for_training_run_decomp_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_for_training_with_container_type_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_for_training_with_dynamic_shapes_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_for_training_with_mutation_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_default_kwargs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_keyword_only_args_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_kwargs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_pytree_kwargs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_var_keyword_args_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_var_keyword_pytree_args_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_func_with_var_postional_args_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_graph_with_no_inputs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_input_mutation_bug_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_input_mutation_dynamic_shape_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_input_mutation_static_shape_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_mod_constraints_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_predispatch_custom_ops_warnings_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_preserve_linear_at_aot_level_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_then_compile_tensor_ctor_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_with_fake_tensor_inputs_on_cuda_devices_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_with_fake_tensor_inputs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_with_inline_constraints_complex_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_with_inline_constraints_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_export_with_wrong_inputs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_external_call_non_strict_real_tensor_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_fake_inputs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_fake_weights_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_float_conversion_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_fqn_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_if_functional_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_intermediate_shape_comp_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_issue_113041_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_keep_composite_ops_invalid_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_keep_composite_ops_linear_convd_for_training_ir_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_keep_composite_ops_linear_convd_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_layer_sharing_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_lazy_module_kwargs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_lifted_constants_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_linear_conv_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_map_buffers_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_map_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_mixed_input_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_module_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_module_with_dict_container_inp_out_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_multiple_definitions_same_name_dim_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nested_dynamic_shapes_spec_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nested_module_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nested_module_with_constant_buffer_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nested_module_with_init_buffer_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nested_module_with_parameter_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nn_module_stack_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nn_module_stack_shared_submodule_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_no_tensor_computation_2_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_no_tensor_computation_3_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_no_tensor_computation_4_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_no_tensor_computation_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_non_arg_name_dynamic_shapes_api_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_non_arg_name_dynamic_shapes_api_with_container_type_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_non_arg_name_dynamic_shapes_api_with_kwarg_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_non_persistent_buffer_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_non_strict_dynamic_shapes_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_non_strict_dynamic_shapes_suggested_fixes_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nonstrict_retrace_preserves_metadata_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_nonzero_2_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_not_correct_dim_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_pad_sequence_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_param_util_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_placeholder_naming_collisions_hoo_subgraphs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_placeholder_naming_collisions_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_predispatch_cond_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_predispatch_grad_wrappers_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_preserve_requires_grad_placeholders_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_preserve_shape_dynamism_for_unused_inputs_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_pytree_register_data_class_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_pytree_register_nested_data_class_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_raise_user_error_when_guard_on_data_dependent_operation_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_redundant_asserts_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_refine_dynamic_shapes_from_suggested_fixes_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_reshape_view_helper_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_retracable_ep_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_retrace_graph_level_meta_preservation_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_retrace_pre_autograd_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_run_decomposition_supports_user_input_mutation_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_runtime_assert_for_prim_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_runtime_assert_for_prm_str_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_runtime_assert_with_size_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_set_grad_empty_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_setgrad_lifted_tensor_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_simple_export_for_training_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_slice_with_floordiv_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_solver_unsupported_sympy_function_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_specialize_derived_dim_roots_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_split_const_gm_with_lifted_constants_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_stack_trace_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_state_primitives_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_state_tensors_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_static_dim_constraints_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_suggested_fixes_for_data_dependent_errors_basic_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_suggested_fixes_for_data_dependent_errors_puzzlers_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_suggested_fixes_new_roots_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_sym_sqrt_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_symint_tensor_return_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_tensor_attribute_zero_args_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_to_module_with_mutated_buffer_multiple_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_to_module_with_mutated_buffer_multiple_update_sub_later_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_to_module_with_mutated_buffer_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_torch_check_eq_commutativity_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_torch_fn_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_trace_under_fake_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_train_eval_on_exported_preautograd_module_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_unbacked_deferred_runtime_retrace_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_unbacked_slice_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_unflatten_asserts_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_unused_aliases_retraceability, test/export/test_retraceability.py::RetraceExportTestExport::test_user_input_and_buffer_mutation_retraceability 2024-08-08T21:35:28.6843314Z 2024-08-08T21:35:31.8536773Z Running test_functionalization 1/1 ... [2024-08-08 21:35:31.853004] 2024-08-08T21:35:31.8539038Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_functionalization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:35:31.853395] 2024-08-08T21:35:39.2850788Z 2024-08-08T21:35:39.2852888Z test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_functionalization_1.1_af9b31041f945614_.log 2024-08-08T21:35:39.2897934Z Running 112 items in this shard: test/test_functionalization.py::TestFunctionalization::test_advanced_indexing, test/test_functionalization.py::TestFunctionalization::test_advanced_indexing_correct_strides, test/test_functionalization.py::TestFunctionalization::test_aliases_maintained_after_pass_when_reapplying_views, test/test_functionalization.py::TestFunctionalization::test_as_strided, test/test_functionalization.py::TestFunctionalization::test_batch_norm, test/test_functionalization.py::TestFunctionalization::test_cat, test/test_functionalization.py::TestFunctionalization::test_channels_last_contiguous, test/test_functionalization.py::TestFunctionalization::test_copy_, test/test_functionalization.py::TestFunctionalization::test_copy_stride_mismatch, test/test_functionalization.py::TestFunctionalization::test_diagonal, test/test_functionalization.py::TestFunctionalization::test_diagonal_mutated_input, test/test_functionalization.py::TestFunctionalization::test_everything, test/test_functionalization.py::TestFunctionalization::test_expand_symint, test/test_functionalization.py::TestFunctionalization::test_fill_, test/test_functionalization.py::TestFunctionalization::test_freeze, test/test_functionalization.py::TestFunctionalization::test_index_mutation_on_non_input, test/test_functionalization.py::TestFunctionalization::test_inplace_on_non_view, test/test_functionalization.py::TestFunctionalization::test_instance_norm, test/test_functionalization.py::TestFunctionalization::test_metadata_change, test/test_functionalization.py::TestFunctionalization::test_metadata_change_out_op, test/test_functionalization.py::TestFunctionalization::test_mixed_wrappers_invalid, test/test_functionalization.py::TestFunctionalization::test_mixed_wrappers_valid, test/test_functionalization.py::TestFunctionalization::test_multi_out, test/test_functionalization.py::TestFunctionalization::test_multiple_views_of_same_base, test/test_functionalization.py::TestFunctionalization::test_mutable_op_not_inplace_or_other, test/test_functionalization.py::TestFunctionalization::test_mutation_overlapping_mem, test/test_functionalization.py::TestFunctionalization::test_nested_functions_propagate_updates, test/test_functionalization.py::TestFunctionalization::test_only_one_view, test/test_functionalization.py::TestFunctionalization::test_optional_tensor_list, test/test_functionalization.py::TestFunctionalization::test_python_functionalization, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_conj, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_is_conj, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_is_neg, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_lift_fresh, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_lift_fresh_storage, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_neg, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_zero_tensor, test/test_functionalization.py::TestFunctionalization::test_reapply_views_simple, test/test_functionalization.py::TestFunctionalization::test_resize_larger_invalid, test/test_functionalization.py::TestFunctionalization::test_resize_larger_valid, test/test_functionalization.py::TestFunctionalization::test_resize_same_size_diff_rank, test/test_functionalization.py::TestFunctionalization::test_resize_smaller, test/test_functionalization.py::TestFunctionalization::test_save_for_backwards_segfault, test/test_functionalization.py::TestFunctionalization::test_scalars, test/test_functionalization.py::TestFunctionalization::test_set_, test/test_functionalization.py::TestFunctionalization::test_simple, test/test_functionalization.py::TestFunctionalization::test_simple_out, test/test_functionalization.py::TestFunctionalization::test_slice, test/test_functionalization.py::TestFunctionalization::test_split, test/test_functionalization.py::TestFunctionalization::test_split_with_sizes, test/test_functionalization.py::TestFunctionalization::test_tensor_ctr, test/test_functionalization.py::TestFunctionalization::test_tensor_list_composite, test/test_functionalization.py::TestFunctionalization::test_tensor_list_mixed_functional_nonfunctional, test/test_functionalization.py::TestFunctionalization::test_unbind, test/test_functionalization.py::TestFunctionalization::test_view_clone_view_inplace, test/test_functionalization.py::TestFunctionalization::test_view_inplace, test/test_functionalization.py::TestCrossRefFunctionalization::test_advanced_indexing, test/test_functionalization.py::TestCrossRefFunctionalization::test_advanced_indexing_correct_strides, test/test_functionalization.py::TestCrossRefFunctionalization::test_aliases_maintained_after_pass_when_reapplying_views, test/test_functionalization.py::TestCrossRefFunctionalization::test_as_strided, test/test_functionalization.py::TestCrossRefFunctionalization::test_batch_norm, test/test_functionalization.py::TestCrossRefFunctionalization::test_cat, test/test_functionalization.py::TestCrossRefFunctionalization::test_channels_last_contiguous, test/test_functionalization.py::TestCrossRefFunctionalization::test_copy_, test/test_functionalization.py::TestCrossRefFunctionalization::test_copy_stride_mismatch, test/test_functionalization.py::TestCrossRefFunctionalization::test_diagonal, test/test_functionalization.py::TestCrossRefFunctionalization::test_diagonal_mutated_input, test/test_functionalization.py::TestCrossRefFunctionalization::test_everything, test/test_functionalization.py::TestCrossRefFunctionalization::test_expand_symint, test/test_functionalization.py::TestCrossRefFunctionalization::test_fill_, test/test_functionalization.py::TestCrossRefFunctionalization::test_freeze, test/test_functionalization.py::TestCrossRefFunctionalization::test_index_mutation_on_non_input, test/test_functionalization.py::TestCrossRefFunctionalization::test_inplace_on_non_view, test/test_functionalization.py::TestCrossRefFunctionalization::test_instance_norm, test/test_functionalization.py::TestCrossRefFunctionalization::test_metadata_change, test/test_functionalization.py::TestCrossRefFunctionalization::test_metadata_change_out_op, test/test_functionalization.py::TestCrossRefFunctionalization::test_mixed_wrappers_invalid, test/test_functionalization.py::TestCrossRefFunctionalization::test_mixed_wrappers_valid, test/test_functionalization.py::TestCrossRefFunctionalization::test_multi_out, test/test_functionalization.py::TestCrossRefFunctionalization::test_multiple_views_of_same_base, test/test_functionalization.py::TestCrossRefFunctionalization::test_mutable_op_not_inplace_or_other, test/test_functionalization.py::TestCrossRefFunctionalization::test_mutation_overlapping_mem, test/test_functionalization.py::TestCrossRefFunctionalization::test_nested_functions_propagate_updates, test/test_functionalization.py::TestCrossRefFunctionalization::test_only_one_view, test/test_functionalization.py::TestCrossRefFunctionalization::test_optional_tensor_list, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_conj, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_is_conj, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_is_neg, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_lift_fresh, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_lift_fresh_storage, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_neg, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_zero_tensor, test/test_functionalization.py::TestCrossRefFunctionalization::test_reapply_views_simple, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_larger_invalid, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_larger_valid, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_same_size_diff_rank, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_smaller, test/test_functionalization.py::TestCrossRefFunctionalization::test_save_for_backwards_segfault, test/test_functionalization.py::TestCrossRefFunctionalization::test_scalars, test/test_functionalization.py::TestCrossRefFunctionalization::test_set_, test/test_functionalization.py::TestCrossRefFunctionalization::test_simple, test/test_functionalization.py::TestCrossRefFunctionalization::test_simple_out, test/test_functionalization.py::TestCrossRefFunctionalization::test_slice, test/test_functionalization.py::TestCrossRefFunctionalization::test_split, test/test_functionalization.py::TestCrossRefFunctionalization::test_split_with_sizes, test/test_functionalization.py::TestCrossRefFunctionalization::test_tensor_ctr, test/test_functionalization.py::TestCrossRefFunctionalization::test_tensor_list_composite, test/test_functionalization.py::TestCrossRefFunctionalization::test_tensor_list_mixed_functional_nonfunctional, test/test_functionalization.py::TestCrossRefFunctionalization::test_unbind, test/test_functionalization.py::TestCrossRefFunctionalization::test_view_clone_view_inplace, test/test_functionalization.py::TestCrossRefFunctionalization::test_view_inplace 2024-08-08T21:35:39.2941980Z 2024-08-08T21:35:42.4761812Z Running test_ops 1/8 ... [2024-08-08 21:35:42.475627] 2024-08-08T21:35:42.4764398Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:35:42.476024] 2024-08-08T21:39:50.2595139Z 2024-08-08T21:39:50.2599548Z inductor/test_torchinductor_opinfo 9/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_9.12_bc50289be67487d1_.log 2024-08-08T21:39:50.2764616Z Running 274 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___ror___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_max_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_max_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igamma_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_fill_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_det_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvals_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_svdvals_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_normalize_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_zeros_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_zeros_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nextafter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_elu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_glu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_inf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_quantile_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_renorm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_cosine_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hann_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_kaiser_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_sampled_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_unbiased_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tile_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_indices_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unravel_index_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vdot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float64 2024-08-08T21:39:50.2903893Z 2024-08-08T21:39:53.5426429Z Running test_ops 6/8 ... [2024-08-08 21:39:53.542098] 2024-08-08T21:39:53.5429866Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=6', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:39:53.542509] 2024-08-08T21:40:35.6887288Z 2024-08-08T21:40:35.6888914Z functorch/test_ops 4/6 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ops_4.6_c3430c356a52daf4_.log 2024-08-08T21:40:35.7713306Z Running 1692 items in this shard: test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_nll_loss_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_extremal_numerics_softmax_cuda, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_bool_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_broadcast_shapes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_isfinite_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_linspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_logical_xor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nan_to_num_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_new_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_max_unpool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_permute_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_real_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unflatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_grad_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__softmax_backward_data_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp__upsample_bilinear2d_aa_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bfloat16_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_fftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_rfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fft_rfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_isposinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_svdvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_logical_and_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mH_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_mul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_ne_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_conv_transpose1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_cosine_similarity_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_pairwise_distance_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_silu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_normal_in_place_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_signal_windows_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_vstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpjvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_count_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_diff_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_double_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ge_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_histc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_lu_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_lt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_min_reduction_with_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_narrow_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv_transpose2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rad2deg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sparse_sampled_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_chebyshev_polynomial_u_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_hermite_polynomial_h_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_i1e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_special_spherical_bessel_j0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_svd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zeros_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmap_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_jvpvmapvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_T_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_broadcast_to_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_conj_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_hsplit_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_list_return_unbind_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_resolve_conj_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_select_grad_op_jvp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_view_then_inplace_view_as_grad_op_vjp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_T_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_ZeroGradientsGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_contiguous_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_hypot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_inv_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linalg_vander_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sgn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sinc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_sort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_legendre_polynomial_p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_modified_bessel_i0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_special_ndtri_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_topk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_view_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp__unsafe_masked_index_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_any_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_argwhere_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_combinations_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_constant_pad_nd_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_cumulative_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_einsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_exp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_heaviside_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eigvals_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_eigvalsh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_meshgrid_list_of_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_conv2d_stride_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_dropout3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_gaussian_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_hinge_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_layer_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_multi_head_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_norm_fro_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_reshape_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_scatter_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_signbit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_i0e_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_triu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unique_consecutive_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjp_zero__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvjpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_atanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_block_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_diagonal_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_digamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_dstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_float_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_fmod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_int_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_kthvalue_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ldexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_linalg_vector_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_long_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_msort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_new_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_channel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_groups_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_conv2d_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_constant_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_pad_replicate_negative_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_round_decimals_neg_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_select_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_laguerre_polynomial_l_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_trapezoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_var_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmap_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vjpvmapvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_SortGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ZeroGradientsGenVmapAutogradFunction_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rdiv___cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__chunk_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__native_batch_norm_legit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__segment_reduce_lengths_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad__unsafe_masked_index_put_accumulate_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_acosh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcdiv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addcmul_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmm_decomposed_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_addmv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_atleast_1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_byte_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_cumulative_trapezoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diag_embed_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_diagonal_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_div_floor_rounding_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dot_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dsplit_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_dstack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_empty_permuted_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_equal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_erf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exp_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_copy_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_exponential_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_fftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ifftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_ihfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_irfft_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_irfftn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_flatten_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_float_power_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_fmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geometric_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_geqrf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_gradient_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_heaviside_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_hsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_igammac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_index_reduce_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_inner_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_int_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isinf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isnan_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_isneginf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_binary_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_jiterator_binary_return_by_ref_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_cholesky_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_det_singular_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_diagonal_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_eig_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_householder_product_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_ldl_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_lu_factor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_multi_dot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_solve_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_tensorinv_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_linalg_vecdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log10_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_log2_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_not_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_logical_xor_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mH_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mT_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_cumprod_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_mean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_masked_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_maximum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nan_to_num_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nanmean_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_batch_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_native_layer_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ne_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_celu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_depthwise_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_padding_no_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv2d_stride_with_bias_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose2d_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_dropout_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_elu_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_fractional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_group_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hardtanh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_bicubic_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_max_unpool2d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multi_head_attention_forward_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multi_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_margin_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pad_replicate_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pairwise_distance_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_pixel_shuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_prelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_rrelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_softmin_with_dtype_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_nonzero_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_norm_inf_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_pinverse_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_polar_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_positive_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_put_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rad2deg_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_randn_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_real_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_reshape_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_resize_as__cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_roll_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_scatter_reduce_amax_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_short_functorch_no_channels_last_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_general_cosine_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hann_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_hann_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_kaiser_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_signal_windows_nuttall_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sinh_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_slice_scatter_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_softmax_with_dtype_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_sparse_mm_reduce_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_airy_ai_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_j1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_bessel_y0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_chebyshev_polynomial_v_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_i0e_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_legendre_polynomial_p_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_i1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_modified_bessel_k1_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_spherical_bessel_j0_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_special_xlog1py_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_split_with_sizes_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_square_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tan_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_topk_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_transpose_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_trapezoid_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_true_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_var_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_view_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_vstack_cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmap_autograd_grad_zero__cuda_float64, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ForwardHasDefaultArgsAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_SelectAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rdiv___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rmatmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addcdiv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_addmv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_asin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atleast_1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cauchy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expand_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_CubeGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_SelectGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__segment_reduce_lengths_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule__unsafe_masked_index_put_accumulate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addcmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_addmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_alias_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bfloat16_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_broadcast_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_bucketize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_conj_physical_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_copysign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_diagonal_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_double_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_eq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_eye_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_fft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fft_ihfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_floor_divide_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_isclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_det_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_det_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_log_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nansum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_dropout2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_embedding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_grid_sample_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardswish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_hardtanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_interpolate_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_logsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_nll_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pad_replicate_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_ops_aten_index_put_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_pinverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_remainder_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_round_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_split_list_args_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_tril_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isnan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_ldl_factor_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_matrix_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_matrix_rank_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_pinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_pinv_hermitian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logaddexp2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_conv_transpose3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_elu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_group_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_instance_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_bicubic_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_mish_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_mse_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_pad_reflect_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_rms_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_nonzero_static_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ops_aten__new_zeros_with_same_feature_meta_functorchonly_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_pow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_ravel_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resolve_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_scatter_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_sparse_mm_reduce_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_entr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_uniform_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_var_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_vdot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpall_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___getitem___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp___rmul___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_acosh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_addmm_decomposed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_all_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_atan2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_atleast_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_bernoulli_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cov_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_diagflat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_div_no_rounding_mode_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_ihfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_fft_rfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_frac_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_geqrf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_half_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_inner_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isneginf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_isreal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_item_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_cholesky_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_lstsq_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_linspace_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logcumsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_logdet_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_amin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_fill_functorch_Scalar_only_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_normalize_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_masked_var_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_matmul_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_matrix_exp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_max_reduction_no_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_meshgrid_variadic_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_movedim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_narrow_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_conv3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_cross_entropy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_fractional_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_glu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_interpolate_trilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_pixel_unshuffle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_softplus_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_put_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_randint_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_reciprocal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_reshape_as_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_scatter_reduce_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_short_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_softmax_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_airy_ai_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_modified_bessel_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_scaled_modified_bessel_k1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_shifted_chebyshev_polynomial_v_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_shifted_chebyshev_polynomial_w_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_sqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_squeeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_std_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_transpose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsafe_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvjp_where_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyCubeNotComposableAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapjvpvmap_SortGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_NumpyTakeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_abs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_argmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_as_strided_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_atleast_3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_bool_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_broadcast_tensors_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ceil_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cholesky_inverse_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_clone_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_cummin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_deg2rad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_diag_embed_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_div_floor_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_empty_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_erfc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_expand_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_hfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_ifft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_fft_irfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_flipud_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_float_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_gather_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_MulGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyCubeAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___radd___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule___rsub___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_angle_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_arange_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_argsort_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_asinh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_chalf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cholesky_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_corrcoef_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cummax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_diag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_div_trunc_rounding_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_empty_permuted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_expand_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_fftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_fft_irfft2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_flip_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_floor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_grid_sampler_2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_half_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_hstack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_le_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_lgamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_cross_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_linalg_pinv_singular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log1p_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_log_normal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_min_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_mvlgamma_mvlgamma_p_5_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nanquantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_avg_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_adaptive_max_pool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_alpha_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_conv2d_stride_padding_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_gelu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_nearest-exact_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_interpolate_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_margin_ranking_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_max_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_multilabel_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_scaled_dot_product_attention_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_smooth_l1_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_triplet_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_nn_functional_upsample_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polar_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polygamma_polygamma_n_2_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_polygamma_polygamma_n_4_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_qr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_randn_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_repeat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_resize_as__cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_rsqrt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_bartlett_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_erfcx_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_i1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_squeeze_multiple_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_sub_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_take_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_tile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_torch_ops_aten__safe_softmax_default_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_trapz_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_unsqueeze_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_has_batch_rule_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_index_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_jiterator_2inputs_2outputs_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_cholesky_ex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_diagonal_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_eigh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_inv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_lu_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_matrix_power_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_solve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_solve_triangular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_linalg_tensorinv_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_log_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logical_or_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logspace_tensor_overload_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_logsumexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_amax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_masked_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nanmean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nanmedian_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_native_dropout_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_new_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nextafter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_batch_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_padding_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_conv2d_stride_with_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_dropout_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_embedding_bag_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_hardshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_interpolate_bilinear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_kl_div_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_linear_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool3d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_mse_loss_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_multi_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pad_circular_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_pdist_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_relu6_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_selu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_nn_functional_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_ormqr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_polygamma_polygamma_n_1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_randint_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_renorm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_resolve_neg_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_roll_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_round_decimals_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scalar_tensor_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_scatter_reduce_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_searchsorted_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_select_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_exponential_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_signal_windows_kaiser_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_slice_scatter_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_ndtr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_polygamma_special_polygamma_n_0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_special_zeta_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_square_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_sum_to_size_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_t_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_take_along_dim_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_to_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_unsafe_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_view_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjp_xlogy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_H_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_NumpySortAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ScaleGradGenVmapAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rmod___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp___rpow___cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp__segment_reduce_offsets_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_acos_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_addr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_allclose_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_aminmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_argmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_as_strided_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_atan_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_baddbmm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cartesian_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cfloat_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_char_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_chunk_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clamp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_clamp_max_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_column_stack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_conj_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_dsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_expm1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_hfft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_ifftshift_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fft_irfftn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fill_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_fliplr_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_frexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_full_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_gradient_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_gt_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_igamma_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_add_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_index_reduce_prod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_int_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_isin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lerp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_lstsq_grad_oriented_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_matrix_rank_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_norm_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_linalg_tensorsolve_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_logaddexp_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_long_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_lu_unpack_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_cumprod_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_cumsum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_log_softmax_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_masked_sum_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_binary_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_max_pool2d_with_indices_backward_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mean_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_median_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_multinomial_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_mvlgamma_mvlgamma_p_3_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_avg_pool1d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_conv2d_strided_padding_dilation_no_bias_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_cosine_embedding_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_ctc_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_embedding_functorch_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_hardsigmoid_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_huber_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_interpolate_area_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_leaky_relu_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_pool2d_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_max_unpool3d_grad_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_soft_margin_loss_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softmin_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softmin_with_dtype_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_softsign_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_tanhshrink_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_threshold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_nn_functional_upsample_nearest_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_inf_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_norm_nuc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ones_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_ones_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_outer_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_quantile_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_rand_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_randn_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_repeat_interleave_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_short_functorch_no_channels_last_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_blackman_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_general_cosine_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_general_hamming_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_signal_windows_nuttall_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_bessel_y1_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_hermite_polynomial_he_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_scaled_modified_bessel_k0_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_special_shifted_chebyshev_polynomial_t_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_split_with_sizes_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_split_with_sizes_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_mean_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_std_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_stft_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tanh_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tensor_split_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_tensordot_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_trunc_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unbind_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unfold_copy_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unfold_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_unique_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_var_unbiased_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_view_as_complex_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_vsplit_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvjp_zeros_like_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyExpMarkDirtyAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_NumpyMulAutogradFunction_cuda_float32, test/functorch/test_ops.py::TestOperatorsCUDA::test_vmapvjpvmap_SortGenVmapAutogradFunction_cuda_float32 2024-08-08T21:40:35.8411722Z 2024-08-08T21:40:38.9662381Z Running test_ops 7/8 ... [2024-08-08 21:40:38.965613] 2024-08-08T21:40:38.9664575Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '-m', 'not serial', '--shard-id=7', '--num-shards=8', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:40:38.966021] 2024-08-08T21:45:03.5696141Z 2024-08-08T21:45:03.5697954Z test_ops 1/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_1.8_4e641004e02454ef_.log 2024-08-08T21:45:03.7504334Z Running 4144 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing___getitem___cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing__chunk_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_long_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mH_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nanmean_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_tanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___radd___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_short_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_equal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pixel_unshuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_normal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_amax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_arange_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argsort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_partial_views_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cartesian_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_frac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_jiterator_unary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kron_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_matrix_rank_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vander_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_matrix_exp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_binary_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_msort_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nanmean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bicubic_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool3d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_qr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_quantile_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_short_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hann_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_laguerre_polynomial_l_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_to_sparse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_errors___radd___cuda, test/test_ops.py::TestCommonCUDA::test_errors___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gather_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_errors_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_errors_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_mean_cuda, test/test_ops.py::TestCommonCUDA::test_errors_median_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_pow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_tril_cuda, test/test_ops.py::TestCommonCUDA::test_errors_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_errors_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_errors_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_put_accumulate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argwhere_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cov_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_int_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_h_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tile_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___getitem___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rand___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rxor___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__chunk_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values__unsafe_masked_index_put_accumulate_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_aminmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_argwhere_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_combinations_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_min_binary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nansum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_cosine_embedding_loss_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ones_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_outer_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_neg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_hermite_polynomial_h_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_combinations_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_einsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_histc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unique_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_hamming_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tile_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_det_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logcumsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hann_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_all_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_count_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cartesian_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_geqrf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isfinite_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_2inputs_2outputs_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigvals_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_matrix_rank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_long_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_pool2d_with_indices_backward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_reduction_with_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nanquantile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_fractional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_logsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_scaled_dot_product_attention_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_outer_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_4_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_short_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_slice_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_y0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_torch__scaled_mm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_mean_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_mean_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_istft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_randn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_polar_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frexp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mish_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_mse_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pdist_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clone_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mH_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ne_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagflat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_det_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logical_not_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rpow___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argwhere_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_hfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_geqrf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_xor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_unpack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_i1e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_H_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_fftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_multi_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logcumsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_in_place_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_transpose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_kron_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_vsplit_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_short_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_3d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_clone_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_conj_physical_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_dstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_item_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_lerp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_or_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_normal__in_place_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rsqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sinc_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_trace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unsqueeze_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_var_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_physical_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_count_nonzero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diff_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_dstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_exp2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_put_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_istft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_kron_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_lerp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_triangular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logcumsumexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_circular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_reflect_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ravel_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resolve_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rsub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tile_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_triangular_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_true_divide_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__chunk_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_contiguous_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_eye_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_fft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_irfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_float_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_imag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isnan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logical_or_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unflatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addmm_decomposed_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_addr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_angle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_argwhere_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chunk_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_corrcoef_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_count_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_int_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_householder_product_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_ldl_factor_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_tensorsolve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_long_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumprod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_sum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_matrix_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_nuc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ones_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rand_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_randn_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rot90_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sinc_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_slice_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_long_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_copysign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diag_embed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_eye_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_hypot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_igamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_igammac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lgamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal__in_place_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_positive_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_round_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_signbit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_ndtri_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_tanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_true_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_where_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmm_decomposed_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_angle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_arange_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argsort_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_broadcast_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cartesian_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_max_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_column_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_combinations_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_copysign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_eye_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_fft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_flipud_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frac_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kthvalue_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_le_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_inv_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_factor_ex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_hermitian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_slogdet_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mT_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_pool2d_with_indices_backward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_native_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_linear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_circular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_constant_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_relu6_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ormqr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_pca_lowrank_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_reshape_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_sum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_bartlett_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_slice_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_laguerre_polynomial_l_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_ndtr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapezoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unfold_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsafe_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_as_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_view_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zero__cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_aminmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_slogdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_select_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_floor_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_heaviside_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_hypot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isposinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ne_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zeros_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___ror___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__chunk_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_aminmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_right_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bucketize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_kron_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resize__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unravel_index_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_uint8, test/test_ops.py::TestTagsCUDA::test_tags___rmod___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_complex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_double_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_all_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_any_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_and_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_broadcast_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eq_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erfinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expand_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_rfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logaddexp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_positive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_real_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_round_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_select_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_transpose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_var_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_view_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcdiv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addmm_decomposed_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argsort_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argwhere_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bucketize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cauchy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_corrcoef_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dsplit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_rfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_flatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_full_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_int_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isreal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_item_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ldexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_lu_factor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_multi_dot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logdet_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_logaddexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_median_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_binary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanmean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nanquantile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_full_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nextafter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_celu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pdist_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_silu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_threshold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pinverse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_qr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hamming_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_slice_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_bessel_y1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i0e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_log_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unravel_index_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_unsafe_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_zeros_cuda_float32 2024-08-08T21:45:03.9117340Z 2024-08-08T21:45:06.8561067Z Running inductor/test_aot_inductor 3/16 ... [2024-08-08 21:45:06.855480] 2024-08-08T21:45:06.8563462Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=3', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:45:06.855851] 2024-08-08T21:49:12.7010985Z 2024-08-08T21:49:12.7012610Z test_ops 6/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_6.8_6aa93b7ba880065c_.log 2024-08-08T21:49:12.9032761Z Running 4023 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsafe_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_H_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_div_no_rounding_mode_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_fft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rand_like_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_slice_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_byte_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_int_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_abs_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log10_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_selu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_pow_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_multigammaln_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_var_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__softmax_backward_data_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addbmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addcdiv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmm_decomposed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clamp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_einsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_imag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_det_singular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_inv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_multi_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mT_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardsigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardswish_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_linear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_multilabel_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_constant_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pad_replicate_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_silu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softsign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polar_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_airy_ai_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_shifted_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_topk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trapz_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unique_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsafe_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_errors___ror___cuda, test/test_ops.py::TestCommonCUDA::test_errors_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_empty_permuted_cuda, test/test_ops.py::TestCommonCUDA::test_errors_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_errors_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ge_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_maximum_cuda, test/test_ops.py::TestCommonCUDA::test_errors_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_errors_neg_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_rrelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout1_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_errors_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_take_cuda, test/test_ops.py::TestCommonCUDA::test_errors_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_errors_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_multiple_devices_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argsort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cummin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mT_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_median_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_outer_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signbit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_all_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_any_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cummin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagflat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expand_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_half_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_put_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_softsign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nonzero_static_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize_as__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_slice_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_consecutive_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsqueeze_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmatmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rmul___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_copysign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gather_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_half_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kthvalue_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_mean_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tile_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_topk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vecdot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_gelu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__softmax_backward_data_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_count_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_einsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_kthvalue_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mT_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_native_dropout_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestCommonCUDA::test_out_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning_T_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_acosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_asinh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_frexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_imag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_svd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_vecdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nan_to_num_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_mse_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_triplet_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal__in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_reshape_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdist_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_eq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_equal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_ifftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isnan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_jiterator_4inputs_with_extra_args_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cholesky_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eig_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_subgradients_at_zero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_log2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_matmul_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_msort_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mvlgamma_mvlgamma_p_1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_binary_cross_entropy_with_logits_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_feature_alpha_dropout_without_train_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardtanh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_kl_div_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_circular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_rms_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_tanhshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_repeat_interleave_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_scatter_reduce_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_hermite_polynomial_he_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_k0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_multiple_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_std_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_to_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_zero__cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_var_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_right_shift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_constant_pad_nd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_T_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ihfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_rfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_ne_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_allclose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isfinite_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_elu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hinge_embedding_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_nll_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal__in_place_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_randn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stft_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_indices_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumprod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exponential_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cfloat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_max_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pairwise_distance_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_randn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_to_size_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_indices_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__chunk_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_einsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ifft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_full_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geqrf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_put_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_slogdet_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdouble_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_slogdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_norm_nuc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_ndtri_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapz_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_unary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_kthvalue_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_matrix_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_celu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_normal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ones_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_select_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_ceil_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_count_nonzero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_dsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_item_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanquantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_square_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_where_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_clamp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_igamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_det_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_repeat_interleave_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_bessel_y1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vsplit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___radd___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addcmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diff_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pinverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_airy_ai_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensor_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view__chunk_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_acos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_eye_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_flip_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_hstack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mul_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sigmoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_to_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_tril_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_alias_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_clone_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_column_stack_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diagflat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_eq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flip_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_int_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cholesky_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_det_singular_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvalsh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_narrow_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_conv2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_fro_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_repeat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scatter_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_list_args_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_transpose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapezoid_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rpow___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_double_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_asin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_conj_physical_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expm1_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isfinite_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_meshgrid_variadic_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_permute_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_all_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cartesian_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_chalf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_char_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cholesky_inverse_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_column_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cov_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagflat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diff_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_inv_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lstsq_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_multi_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log10_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_cumsum_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pad_circular_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softsign_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_positive_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_pow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_svd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_triangular_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_var_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zero__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view_T_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___radd___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rsub___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_int_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_all_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_allclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_as_strided_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_asinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atan2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_atleast_3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_count_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dsplit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_fftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_item_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lerp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_alpha_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rot90_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rsub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_select_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_entr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_zeta_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trunc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__softmax_backward_data_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_alias_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_baddbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clamp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cosh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cummax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_hfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gradient_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_gt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_histc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_inner_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isneginf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_kron_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cond_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_pinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_solve_triangular_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_amax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_cumprod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nan_to_num_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmedian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanquantile_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nansum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_gelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_kl_div_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_normalize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_rms_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_silu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_upsample_nearest_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_put_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randint_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_resolve_neg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_scatter_reduce_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_select_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_blackman_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_stft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_svd_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_take_along_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trapz_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unflatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_var_mean_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_argsort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_count_nonzero_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_norm_nuc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_randn_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_y0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bfloat16_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__unsafe_masked_index_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_tensordot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addmm_decomposed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_pow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_i0e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_ndtr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_eq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_erfc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_det_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanquantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_repeat_interleave_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_searchsorted_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signbit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_j0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_entr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_triangular_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trunc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zero__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_asinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_shapes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_inv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lstsq_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mH_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_one_hot_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softsign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resolve_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_split_list_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_true_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rand___cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags___rmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__batch_norm_with_update_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_addr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_arange_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_as_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_block_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clamp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_column_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_copysign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_cosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exponential_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_eye_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fmod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_frac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isnan_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_masked_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_mul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_pow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_remainder_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_reshape_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_roll_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sgn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_entr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_ndtri_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sum_to_size_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_triu_indices_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_unbind_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__softmax_backward_data_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cartesian_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_char_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagflat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fliplr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_floor_divide_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_gradient_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_put_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isfinite_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_unary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_le_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_det_singular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_inv_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vector_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logical_not_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lu_unpack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_dropout_backward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_elu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_normal_number_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_pca_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_real_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_reciprocal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resize__cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hann_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_t_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tril_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unique_consecutive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_view_as_real_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_view_copy_cuda_float32 2024-08-08T21:49:13.0533011Z 2024-08-08T21:49:15.8533695Z Running inductor/test_aot_inductor 7/16 ... [2024-08-08 21:49:15.852753] 2024-08-08T21:49:15.8535816Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=7', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:49:15.853122] 2024-08-08T21:50:26.3532374Z 2024-08-08T21:50:26.3533798Z test_ops 7/8 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_7.8_285c7eb3b89094cc_.log 2024-08-08T21:50:26.5369753Z Running 4117 items in this shard: test/test_ops.py::TestCommonCUDA::test_compare_cpu___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_unsafe_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_as_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_conj_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_full_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_hsplit_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_sum_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes___getitem___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___ror___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__native_batch_norm_legit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_half_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_polar_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_conj_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_floor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_smooth_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ones_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_std_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unflatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_where_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_byte_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cholesky_inverse_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clone_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cumulative_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_erf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fliplr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_reduce_prod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_int_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ldexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logdet_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_median_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_min_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_dropout_backward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_ctc_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_instance_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_unpool2d_grad_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_softplus_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_4_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_real_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_resize__cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_roll_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sinh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_slice_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_u_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_i0e_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_legendre_polynomial_p_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_scaled_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_xlog1py_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_std_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sub_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tanh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_torch_ops_aten__efficient_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_triangular_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vstack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zeros_like_cuda, test/test_ops.py::TestCommonCUDA::test_errors_amin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_errors_cov_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diag_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_errors_histogramdd_cuda, test/test_ops.py::TestCommonCUDA::test_errors_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_errors_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_errors_min_binary_cuda, test/test_ops.py::TestCommonCUDA::test_errors_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ne_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_embedding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_ormqr_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_gaussian_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_general_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_randn_like_layout0_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_zeros_like_layout2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sub_cuda, test/test_ops.py::TestCommonCUDA::test_multiple_devices_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___radd___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rsub___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_angle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_physical_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_corrcoef_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_permuted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_histc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_inner_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_kthvalue_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vander_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pairwise_distance_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nonzero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_split_list_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_std_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unique_consecutive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_byte_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagonal_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_rfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_full_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_unary_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_select_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_max_reduction_no_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_repeat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_reshape_as_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_rsqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_slice_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_y0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_chebyshev_polynomial_v_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_legendre_polynomial_p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_ndtri_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_scaled_modified_bessel_k0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zero__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zeros_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_H_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___getitem___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rxor___cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_partial_views_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_baddbmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cartesian_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ceil_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cov_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_hfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gradient_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_unary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_singular_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_reduction_with_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_median_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_min_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nanmean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_poisson_nll_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_outer_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_put_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rand_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_prod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sum_to_size_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diagflat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_vander_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_rms_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_blackman_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_hamming_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out___rand___cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out__refs_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_permute_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_sin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_permuted_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_flipud_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_jiterator_unary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_qr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_column_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diagonal_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_inner_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nanmean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_fro_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_outer_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_outer_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rsqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tensordot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_kaiser_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sqrt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_var_unbiased_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_warning__batch_norm_with_update_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_chalf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_int_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_polar_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_conj_physical_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_constant_pad_nd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_fftshift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isneginf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nextafter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_glu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_leaky_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_positive_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_i1e_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_logit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_multigammaln_mvlgamma_p_5_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_square_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unfold_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_baddbmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bfloat16_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_char_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_clamp_min_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cosh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cumulative_trapezoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diag_embed_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_no_rounding_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_einsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_grid_sampler_2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_histogramdd_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_int_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_isposinf_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lgamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_eigh_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_pinv_singular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_vander_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logdet_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_matrix_exp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_max_reduction_no_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nanmedian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ne_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_adaptive_avg_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_nearest-exact_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_max_unpool1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_permute_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_polygamma_polygamma_n_3_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_prod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rad2deg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randint_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randint_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resolve_conj_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_searchsorted_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_general_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_y1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_polygamma_special_polygamma_n_0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_scaled_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tensordot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_topk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_trunc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_unique_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_cuda, test/test_ops.py::TestCommonCUDA::test_pointwise_tag_coverage_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_digamma_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_w_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcdiv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_full_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_std_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unflatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_hfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_geometric_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logaddexp_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_special_zeta_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_t_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_complex_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_half_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_short_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcmul_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cauchy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_trunc_rounding_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_float_power_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmod_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_frac_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gt_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_istft_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lcm_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svdvals_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_dropout_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_glu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_leaky_relu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softplus_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_tanhshrink_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rad2deg_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_ndtr_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_zeta_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_char_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_and_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cauchy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frac_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isclose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lerp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vecdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_mul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_leaky_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_roll_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_where_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_like_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_i0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lcm_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_maximum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_native_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_poisson_nll_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal__in_place_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_spherical_bessel_j0_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_det_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_householder_product_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_matrix_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nan_to_num_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_interleave_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_interleave_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resolve_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_view_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_acosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_broadcast_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cartesian_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_flipud_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_index_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_det_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_vecdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_rad2deg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__softmax_backward_data_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_all_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ihfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lgamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_randint_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ravel_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rdiv___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad___rmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_div_floor_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_ifftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_frac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isposinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logaddexp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_randn_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_spherical_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator___rsub___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_bmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_deg2rad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_expand_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_histc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_inv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_long_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_layer_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_glu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_instance_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polar_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rand_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_rsqrt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_short_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_view_as_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_abs_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_acos_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_all_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_bucketize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chalf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_digamma_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flip_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_float_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_frexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isinf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mH_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mT_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nextafter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_elu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_normal_number_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ones_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resize__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resolve_conj_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resolve_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_j1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_topk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trunc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_zeros_like_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view_H_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_char_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_allclose_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_constant_pad_nd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_fft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_float_power_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_imag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isreal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log1p_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_masked_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_new_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ravel_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_roll_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_rot90_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_sgn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_t_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unbind_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unfold_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_vsplit_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_abs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atanh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_bfloat16_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_broadcast_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_chalf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cholesky_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_constant_pad_nd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cumsum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_div_no_rounding_mode_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expm1_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_fftshift_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_flatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_jiterator_unary_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_cond_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_vander_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_not_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_normalize_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_put_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_resize_as__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_roll_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_svd_lowrank_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_take_along_dim_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_take_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tan_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_trapz_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zeros_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_H_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_cdouble_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs__conversions_float_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_abs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_column_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_flip_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_isinf_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log1p_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_log2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_zeros_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_pow_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sgn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_transpose_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_var_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_var_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_abs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_conj_physical_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dist_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_equal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifftshift_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flip_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_power_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_gradient_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_half_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_inner_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_item_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_unary_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cholesky_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cholesky_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cond_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_slogdet_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_solve_ex_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vander_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mT_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_mean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_normalize_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nanmean_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_empty_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_linear_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pairwise_distance_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_static_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_normal_in_place_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_permute_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ravel_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reciprocal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_repeat_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_scatter_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sum_to_size_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_t_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_trace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unflatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view___getitem___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmod___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bool_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_float_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addcdiv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_arange_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_block_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_deg2rad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_eq_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_exp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_rfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_frexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_heaviside_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_i0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isreal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_le_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_matrix_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log10_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_mul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_narrow_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_empty_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_gelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_pow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_i1e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_take_along_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_trace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_var_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_vstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_add_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_all_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_allclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_char_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cholesky_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clone_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cov_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_div_no_rounding_mode_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ifftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_ihfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_flatten_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fliplr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_float_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_heaviside_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_i0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_put_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_ldexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lerp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lgamma_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eig_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eigvalsh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_ldl_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_tensorinv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logcumsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_not_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mH_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logsumexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_matmul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_with_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_minimum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_movedim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_multinomial_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nanmean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_cosine_similarity_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_embedding_bag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardsigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_instance_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool1d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_reflect_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_rrelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_soft_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_threshold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_nuc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_normal_in_place_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_permute_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_positive_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_pow_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_round_decimals_0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_select_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_general_cosine_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hann_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_kaiser_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_nuttall_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_v_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i1e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_modified_bessel_i1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_modified_bessel_k1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_xlog1py_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_list_args_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_stack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tensordot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_trace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_zeros_like_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___radd___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake___rmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__chunk_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_angle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_arange_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_bitwise_xor_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_char_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cross_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_rfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_index_reduce_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_jiterator_unary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nonzero_static_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_hann_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_airy_ai_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_bessel_j1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_zeta_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsqueeze_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_conj_physical_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addmv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_constant_pad_nd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cummin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_floor_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_fftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_hsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_gelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_transpose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rdiv___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_acos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cos_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nan_to_num_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_neg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_relu6_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_svd_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_flipud_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fmod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logaddexp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_movedim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanmean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_embedding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_normal_number_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_randn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rot90_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_scalar_tensor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_slice_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_with_sizes_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_unique_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_allclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_column_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_contiguous_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_float_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_divide_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_frexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_hstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_bilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_pinverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_resize_as__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_roll_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_bessel_y1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_erfcx_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_zeta_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_along_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_tensor_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unique_consecutive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_zeros_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_bool, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_float64, test/test_ops.py::TestTagsCUDA::test_tags___rmatmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_float_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_1d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_left_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_dstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_empty_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_exp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_expm1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_fft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_frexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_gt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_imag_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_add_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lgamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log10_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_or_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nextafter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ravel_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sigmoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_triu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_trunc_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_where_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_all_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_right_shift_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_ceil_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cholesky_inverse_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_combinations_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_conj_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_constant_pad_nd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cosh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diag_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagonal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_dot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_permuted_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_equal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_erf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_exp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_half_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_imag_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_istft_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_kron_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_pinv_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_solve_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logaddexp2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mH_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_batch_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_layer_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_dropout2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_gelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nonzero_static_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_quantile_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randint_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ravel_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resolve_conj_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scalar_tensor_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sgn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_entr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i1e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_zeta_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_square_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_unbiased_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unsqueeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_var_unbiased_cuda_float32 2024-08-08T21:50:26.6921394Z 2024-08-08T21:50:29.5820611Z Running inductor/test_aot_inductor 9/16 ... [2024-08-08 21:50:29.581443] 2024-08-08T21:50:29.5823206Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=9', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 21:50:29.581855] 2024-08-08T22:08:50.2383159Z 2024-08-08T22:08:50.2384640Z PRINTING LOG FILE of inductor/test_aot_inductor 3/16 (test/test-reports/inductor.test_aot_inductor_3.16_0d0a4d0bf708ed2b_.log) 2024-08-08T22:08:50.2388309Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-9ee3e7e9abe6a3c3.xml 2024-08-08T22:08:50.2389643Z ============================= test session starts ============================== 2024-08-08T22:08:50.2390880Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:08:50.2391939Z cachedir: .pytest_cache 2024-08-08T22:08:50.2393128Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:08:50.2394257Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:08:50.2394794Z configfile: pytest.ini 2024-08-08T22:08:50.2395694Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:08:50.2396533Z collecting ... collected 912 items 2024-08-08T22:08:50.2397037Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T22:08:50.2443036Z Running 57 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_convolution_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nested_tensor_from_jagged_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_split_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_view_outputs_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_cubin_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pytree_inputs_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_large_grid_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_linear_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_default_cuda_device_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_2_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_quantized_linear_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_return_view_constant_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_sdpa_2_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_dynamic_shape_with_div_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_zero_grid_with_backed_symbols_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_amp_fallback_random_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_with_outer_code_before_after_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_embedding_bag_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fx_gm_return_tuple_validation_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_output_misaligned_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_repeated_user_defined_triton_kernel_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_runtime_checks_shape_failed_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_dynamic_shape_with_div_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_multi_output_arg_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_True_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_custom_op_add_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_fqn_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_nested_tensor_from_jagged_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_repeat_interleave_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_repeat_output_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_3_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_constant_folding_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fqn_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_sdpa_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_seq_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_while_loop_simple_non_abi_compatible_cuda 2024-08-08T22:08:50.2483887Z 2024-08-08T22:08:50.2485119Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.4416s] [ 1%] 2024-08-08T22:08:50.2487022Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_convolution_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.6681s] [ 3%] 2024-08-08T22:08:50.2489271Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_abi_compatible_cpu SKIPPED [0.0003s] (Test was marked as expected failure, but does not fail always anymore.) [ 5%] 2024-08-08T22:08:50.2491560Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 7%] 2024-08-08T22:08:50.2493717Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nested_tensor_from_jagged_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.6313s] [ 8%] 2024-08-08T22:08:50.2495653Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_abi_compatible_cpu SKIPPED [0.0033s] (requires CUDA) [ 10%] 2024-08-08T22:08:50.2497494Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.7008s] [ 12%] 2024-08-08T22:08:50.2499171Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_split_abi_compatible_cpu PASSED [7.2376s] [ 14%] 2024-08-08T22:08:50.2500964Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 15%] 2024-08-08T22:08:50.2502998Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_view_outputs_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.1282s] [ 17%] 2024-08-08T22:08:50.2504931Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.5928s] [ 19%] 2024-08-08T22:08:50.2507153Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_cubin_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 21%] 2024-08-08T22:08:50.2509587Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pytree_inputs_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 22%] 2024-08-08T22:08:50.2512242Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0031s] (requires CUDA) [ 24%] 2024-08-08T22:08:50.2514848Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 26%] 2024-08-08T22:08:50.2518361Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 28%] 2024-08-08T22:08:50.2521474Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_large_grid_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0027s] (requires CUDA) [ 29%] 2024-08-08T22:08:50.2524565Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_linear_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 31%] 2024-08-08T22:08:50.2527840Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_default_cuda_device_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0001s] (requires multiple cuda devices) [ 33%] 2024-08-08T22:08:50.2531039Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_2_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [15.3131s] [ 35%] 2024-08-08T22:08:50.2533879Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_quantized_linear_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 36%] 2024-08-08T22:08:50.2536753Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_return_view_constant_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [6.5465s] [ 38%] 2024-08-08T22:08:50.2539612Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 40%] 2024-08-08T22:08:50.2542420Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_sdpa_2_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.3559s] [ 42%] 2024-08-08T22:08:50.2545637Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_dynamic_shape_with_div_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 43%] 2024-08-08T22:08:50.2548894Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0026s] (requires CUDA) [ 45%] 2024-08-08T22:08:50.2552256Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_zero_grid_with_backed_symbols_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 47%] 2024-08-08T22:08:50.2554777Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_amp_fallback_random_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.4708s] [ 49%] 2024-08-08T22:08:50.2556776Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_with_outer_code_before_after_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [9.1613s] [ 50%] 2024-08-08T22:08:50.2558742Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_embedding_bag_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.3865s] [ 52%] 2024-08-08T22:08:50.2560752Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fx_gm_return_tuple_validation_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [0.0149s] [ 54%] 2024-08-08T22:08:50.2562701Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_output_misaligned_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.0487s] [ 56%] 2024-08-08T22:08:50.2564706Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_repeated_user_defined_triton_kernel_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.9697s] [ 57%] 2024-08-08T22:08:50.2567059Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_runtime_checks_shape_failed_abi_compatible_cuda <- test/inductor/test_torchinductor.py Error: input_handles[0]: unmatched dim value at 1, expected: 4, but got: 8 2024-08-08T22:08:50.2568337Z 2024-08-08T22:08:50.2568654Z Error: input_handles[0]: unmatched stride value at 1, expected: 4, but got: 1 2024-08-08T22:08:50.2569113Z 2024-08-08T22:08:50.2569495Z Error: input_handles[0]: dim value is too large at 0, expected to be <= 1024, but got: 2048 2024-08-08T22:08:50.2570016Z 2024-08-08T22:08:50.2570379Z Error: input_handles[0]: dim value is too large at 0, expected to be <= 1024, but got: 2048 2024-08-08T22:08:50.2570911Z 2024-08-08T22:08:50.2571119Z PASSED [5.5234s] [ 59%] 2024-08-08T22:08:50.2572456Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_dynamic_shape_with_div_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.1936s] [ 61%] 2024-08-08T22:08:50.2574464Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_abi_compatible_cuda PASSED [7.7996s] [ 63%] 2024-08-08T22:08:50.2576444Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_multi_output_arg_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.9895s] [ 64%] 2024-08-08T22:08:50.2578392Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cuda PASSED [7.9085s] [ 66%] 2024-08-08T22:08:50.2580255Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cuda PASSED [7.8160s] [ 68%] 2024-08-08T22:08:50.2582218Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cuda PASSED [8.7402s] [ 70%] 2024-08-08T22:08:50.2584994Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_True_non_abi_compatible_cpu W0808 21:48:07.818000 40360 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:08:50.2588086Z W0808 21:48:07.819000 40360 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:08:50.2589659Z PASSED [15.5994s] [ 71%] 2024-08-08T22:08:50.2590991Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_custom_op_add_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 73%] 2024-08-08T22:08:50.2593271Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.9116s] [ 75%] 2024-08-08T22:08:50.2595264Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_fqn_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5174s] [ 77%] 2024-08-08T22:08:50.2597204Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_nested_tensor_from_jagged_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.9322s] [ 78%] 2024-08-08T22:08:50.2599214Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_repeat_interleave_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.6340s] [ 80%] 2024-08-08T22:08:50.2601297Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_repeat_output_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.1273s] [ 82%] 2024-08-08T22:08:50.2603452Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_non_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 84%] 2024-08-08T22:08:50.2605625Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_non_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 85%] 2024-08-08T22:08:50.2608129Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_non_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 87%] 2024-08-08T22:08:50.2610276Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_3_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [18.4921s] [ 89%] 2024-08-08T22:08:50.2612274Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_constant_folding_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.9882s] [ 91%] 2024-08-08T22:08:50.2614193Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fqn_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.0390s] [ 92%] 2024-08-08T22:08:50.2616543Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_sdpa_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.6136s] [ 94%] 2024-08-08T22:08:50.2618408Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_seq_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.5929s] [ 96%] 2024-08-08T22:08:50.2620346Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_non_abi_compatible_cuda PASSED [21.5235s] [ 98%] 2024-08-08T22:08:50.2622343Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_while_loop_simple_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.8798s] [100%] 2024-08-08T22:08:50.2623345Z 2024-08-08T22:08:50.2624246Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-9ee3e7e9abe6a3c3.xml - 2024-08-08T22:08:50.2625665Z ================== 36 passed, 21 skipped in 388.68s (0:06:28) ================== 2024-08-08T22:08:50.2626390Z Got exit code -11 (SIGSEGV) 2024-08-08T22:08:50.2626758Z Retrying single test... 2024-08-08T22:08:50.2627684Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-036e5e2aa2951390.xml 2024-08-08T22:08:50.2628755Z ============================= test session starts ============================== 2024-08-08T22:08:50.2629660Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:08:50.2630356Z cachedir: .pytest_cache 2024-08-08T22:08:50.2631214Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:08:50.2632171Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:08:50.2632576Z configfile: pytest.ini 2024-08-08T22:08:50.2633341Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:08:50.2634161Z collecting ... collected 912 items 2024-08-08T22:08:50.2635007Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T22:08:50.2635490Z Running 57 items in this shard 2024-08-08T22:08:50.2635746Z 2024-08-08T22:08:50.2636724Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.4340s] [ 1%] 2024-08-08T22:08:50.2639737Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_convolution_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.6458s] [ 3%] 2024-08-08T22:08:50.2641900Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_abi_compatible_cpu SKIPPED [0.0003s] (Test was marked as expected failure, but does not fail always anymore.) [ 5%] 2024-08-08T22:08:50.2644081Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 7%] 2024-08-08T22:08:50.2646062Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nested_tensor_from_jagged_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.6422s] [ 8%] 2024-08-08T22:08:50.2648375Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_abi_compatible_cpu SKIPPED [0.0033s] (requires CUDA) [ 10%] 2024-08-08T22:08:50.2650410Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.6335s] [ 12%] 2024-08-08T22:08:50.2652226Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_split_abi_compatible_cpu PASSED [7.3525s] [ 14%] 2024-08-08T22:08:50.2654013Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0032s] (requires CUDA) [ 15%] 2024-08-08T22:08:50.2655978Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_view_outputs_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.1198s] [ 17%] 2024-08-08T22:08:50.2657916Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4448s] [ 19%] 2024-08-08T22:08:50.2660142Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_cubin_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 21%] 2024-08-08T22:08:50.2662571Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pytree_inputs_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 22%] 2024-08-08T22:08:50.2665041Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0031s] (requires CUDA) [ 24%] 2024-08-08T22:08:50.2667608Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 26%] 2024-08-08T22:08:50.2670435Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 28%] 2024-08-08T22:08:50.2673701Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_large_grid_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0026s] (requires CUDA) [ 29%] 2024-08-08T22:08:50.2677129Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_linear_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 31%] 2024-08-08T22:08:50.2680169Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_default_cuda_device_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (requires multiple cuda devices) [ 33%] 2024-08-08T22:08:50.2683130Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_2_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [15.0330s] [ 35%] 2024-08-08T22:08:50.2685970Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_quantized_linear_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 36%] 2024-08-08T22:08:50.2688850Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_return_view_constant_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [6.5319s] [ 38%] 2024-08-08T22:08:50.2691818Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 40%] 2024-08-08T22:08:50.2694634Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_sdpa_2_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.3390s] [ 42%] 2024-08-08T22:08:50.2697782Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_dynamic_shape_with_div_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0031s] (requires CUDA) [ 43%] 2024-08-08T22:08:50.2701026Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0026s] (requires CUDA) [ 45%] 2024-08-08T22:08:50.2704214Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_zero_grid_with_backed_symbols_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 47%] 2024-08-08T22:08:50.2706771Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_amp_fallback_random_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.4475s] [ 49%] 2024-08-08T22:08:50.2708783Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_with_outer_code_before_after_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.9545s] [ 50%] 2024-08-08T22:08:50.2710765Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_embedding_bag_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.3800s] [ 52%] 2024-08-08T22:08:50.2713015Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fx_gm_return_tuple_validation_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [0.0148s] [ 54%] 2024-08-08T22:08:50.2714968Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_output_misaligned_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.9809s] [ 56%] 2024-08-08T22:08:50.2717670Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_repeated_user_defined_triton_kernel_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.0327s] [ 57%] 2024-08-08T22:08:50.2719892Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_runtime_checks_shape_failed_abi_compatible_cuda <- test/inductor/test_torchinductor.py Error: input_handles[0]: unmatched dim value at 1, expected: 4, but got: 8 2024-08-08T22:08:50.2721058Z 2024-08-08T22:08:50.2721376Z Error: input_handles[0]: unmatched stride value at 1, expected: 4, but got: 1 2024-08-08T22:08:50.2721839Z 2024-08-08T22:08:50.2722219Z Error: input_handles[0]: dim value is too large at 0, expected to be <= 1024, but got: 2048 2024-08-08T22:08:50.2722740Z 2024-08-08T22:08:50.2723105Z Error: input_handles[0]: dim value is too large at 0, expected to be <= 1024, but got: 2048 2024-08-08T22:08:50.2723628Z 2024-08-08T22:08:50.2723833Z PASSED [5.5350s] [ 59%] 2024-08-08T22:08:50.2725179Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_dynamic_shape_with_div_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.3931s] [ 61%] 2024-08-08T22:08:50.2727282Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_abi_compatible_cuda PASSED [7.9383s] [ 63%] 2024-08-08T22:08:50.2729244Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_multi_output_arg_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.1797s] [ 64%] 2024-08-08T22:08:50.2731096Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cuda PASSED [8.0636s] [ 66%] 2024-08-08T22:08:50.2732953Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cuda PASSED [7.9449s] [ 68%] 2024-08-08T22:08:50.2734918Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cuda PASSED [8.9146s] [ 70%] 2024-08-08T22:08:50.2737690Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_True_non_abi_compatible_cpu W0808 21:59:44.682000 44772 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:08:50.2740798Z W0808 21:59:44.683000 44772 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:08:50.2742312Z PASSED [15.5978s] [ 71%] 2024-08-08T22:08:50.2743642Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_custom_op_add_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 73%] 2024-08-08T22:08:50.2745752Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.7048s] [ 75%] 2024-08-08T22:08:50.2747864Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_fqn_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.4627s] [ 77%] 2024-08-08T22:08:50.2749812Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_nested_tensor_from_jagged_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.3794s] [ 78%] 2024-08-08T22:08:50.2752057Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_repeat_interleave_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.0709s] [ 80%] 2024-08-08T22:08:50.2754023Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_repeat_output_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.8131s] [ 82%] 2024-08-08T22:08:50.2756071Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_non_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 84%] 2024-08-08T22:08:50.2758224Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_non_abi_compatible_cpu SKIPPED [0.0029s] (requires CUDA) [ 85%] 2024-08-08T22:08:50.2760410Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_non_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 87%] 2024-08-08T22:08:50.2762493Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_3_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [18.2759s] [ 89%] 2024-08-08T22:08:50.2764562Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_constant_folding_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.2834s] [ 91%] 2024-08-08T22:08:50.2766498Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fqn_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.9464s] [ 92%] 2024-08-08T22:08:50.2768363Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_sdpa_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.5306s] [ 94%] 2024-08-08T22:08:50.2770242Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_seq_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.1714s] [ 96%] 2024-08-08T22:08:50.2772189Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_non_abi_compatible_cuda PASSED [21.4580s] [ 98%] 2024-08-08T22:08:50.2774191Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_while_loop_simple_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.9914s] [100%] 2024-08-08T22:08:50.2775185Z 2024-08-08T22:08:50.2776087Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-036e5e2aa2951390.xml - 2024-08-08T22:08:50.2777517Z ================== 36 passed, 21 skipped in 386.84s (0:06:26) ================== 2024-08-08T22:08:50.2778244Z Got exit code -11 (SIGSEGV) 2024-08-08T22:08:50.2778610Z Retrying single test... 2024-08-08T22:08:50.2779539Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-54badce2f3d68f90.xml 2024-08-08T22:08:50.2780618Z ============================= test session starts ============================== 2024-08-08T22:08:50.2781518Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:08:50.2782212Z cachedir: .pytest_cache 2024-08-08T22:08:50.2783164Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:08:50.2784006Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:08:50.2784408Z configfile: pytest.ini 2024-08-08T22:08:50.2785165Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:08:50.2786243Z collecting ... collected 912 items / 56 deselected / 856 selected 2024-08-08T22:08:50.2787494Z stepcurrent: skipping 56 already run items. Running only test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_while_loop_simple_non_abi_compatible_cuda 2024-08-08T22:08:50.2788574Z Running 1 items in this shard 2024-08-08T22:08:50.2788832Z 2024-08-08T22:08:50.2789840Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_while_loop_simple_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [17.8020s] [100%] 2024-08-08T22:08:50.2790849Z 2024-08-08T22:08:50.2791738Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-54badce2f3d68f90.xml - 2024-08-08T22:08:50.2793278Z ====================== 1 passed, 56 deselected in 17.88s ======================= 2024-08-08T22:08:50.2793942Z Got exit code 0 2024-08-08T22:08:50.2794415Z Test succeeeded in new process, continuing with the rest of the tests 2024-08-08T22:08:50.2795569Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-fc78df539eb126f2.xml 2024-08-08T22:08:50.2796720Z ============================= test session starts ============================== 2024-08-08T22:08:50.2797607Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:08:50.2798307Z cachedir: .pytest_cache 2024-08-08T22:08:50.2799192Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:08:50.2800063Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:08:50.2800467Z configfile: pytest.ini 2024-08-08T22:08:50.2801236Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:08:50.2802183Z collecting ... collected 912 items / 57 deselected / 855 selected 2024-08-08T22:08:50.2802787Z stepcurrent: skipping 57 already run items. 2024-08-08T22:08:50.2803242Z Running 0 items in this shard 2024-08-08T22:08:50.2803489Z 2024-08-08T22:08:50.2804367Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-fc78df539eb126f2.xml - 2024-08-08T22:08:50.2805708Z ============================ 57 deselected in 0.07s ============================ 2024-08-08T22:08:50.2807357Z The following tests failed and then succeeded when run in a new process['ul', 'test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_while_loop_simple_non_abi_compatible_cuda'] 2024-08-08T22:08:50.2808419Z 2024-08-08T22:08:50.2809191Z FINISHED PRINTING LOG FILE of inductor/test_aot_inductor 3/16 (test/test-reports/inductor.test_aot_inductor_3.16_0d0a4d0bf708ed2b_.log) 2024-08-08T22:08:50.2809934Z 2024-08-08T22:08:53.4057720Z Running inductor/test_compiled_optimizers 2/4 ... [2024-08-08 22:08:53.405059] 2024-08-08T22:08:53.4059777Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'not serial', '--shard-id=2', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 22:08:53.405427] 2024-08-08T22:14:00.6861562Z 2024-08-08T22:14:00.6863411Z PRINTING LOG FILE of inductor/test_aot_inductor 9/16 (test/test-reports/inductor.test_aot_inductor_9.16_f828ddeda7877015_.log) 2024-08-08T22:14:00.6866582Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-1b9316ebd13cce28.xml 2024-08-08T22:14:00.6869568Z ============================= test session starts ============================== 2024-08-08T22:14:00.6871120Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:00.6872854Z cachedir: .pytest_cache 2024-08-08T22:14:00.6874354Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:00.6875713Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:00.6876363Z configfile: pytest.ini 2024-08-08T22:14:00.6877680Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:00.6879034Z collecting ... collected 912 items 2024-08-08T22:14:00.6879833Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T22:14:00.6956899Z Running 64 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_reuse_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_conv_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_duplicated_params_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_cat_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_embedding_bag_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fqn_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_misaligned_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_simple_split_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_sympy_fn_like_arg_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_nested_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_consecutive_compiles_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_empty_graph_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_multi_device_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_quantized_linear_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_view_outputs_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_while_loop_with_outer_code_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_embedding_bag_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_empty_graph_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_default_cuda_device_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_tensor_input_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_shape_failed_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_seq_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_with_none_input_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_unsupported_input_dtype_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_amp_fallback_random_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_dup_unbacked_sym_decl_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_embedding_bag_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_foreach_multiple_dynamic_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fp8_view_of_param_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_index_put_with_none_index_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda 2024-08-08T22:14:00.7035588Z 2024-08-08T22:14:00.7037521Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.1349s] [ 1%] 2024-08-08T22:14:00.7040840Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.7056s] [ 3%] 2024-08-08T22:14:00.7044176Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 4%] 2024-08-08T22:14:00.7048128Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.5029s] [ 6%] 2024-08-08T22:14:00.7051335Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.3820s] [ 7%] 2024-08-08T22:14:00.7054472Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.5547s] [ 9%] 2024-08-08T22:14:00.7058372Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0808 21:51:23.083000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7061620Z W0808 21:51:23.096000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7063180Z W0808 21:51:37.543000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7064988Z W0808 21:51:37.555000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7066248Z PASSED [30.1138s] [ 10%] 2024-08-08T22:14:00.7068354Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.7535s] [ 12%] 2024-08-08T22:14:00.7072230Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0033s] (requires CUDA) [ 14%] 2024-08-08T22:14:00.7075973Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0045s] (requires CUDA) [ 15%] 2024-08-08T22:14:00.7079501Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0053s] (requires CUDA) [ 17%] 2024-08-08T22:14:00.7083151Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0044s] (requires CUDA) [ 18%] 2024-08-08T22:14:00.7087306Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0052s] (requires CUDA) [ 20%] 2024-08-08T22:14:00.7091499Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0003s] (Skipped!) [ 21%] 2024-08-08T22:14:00.7096674Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py In file included from /tmp/tmppkgvo9c1/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/cv3tq3jjutsu2bwk355cgldi64n3ej7utgfhokihcnt2ock3q5gf.cpp:5: 2024-08-08T22:14:00.7106637Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h: In instantiation of ‘void torch::aot_inductor::ThreadLocalCachedOutputTensor >::copy_data_from(const torch::aot_inductor::ArrayRefTensor&) [with T = float]’: 2024-08-08T22:14:00.7110552Z /tmp/tmppkgvo9c1/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/cv3tq3jjutsu2bwk355cgldi64n3ej7utgfhokihcnt2ock3q5gf.cpp:633:44: required from here 2024-08-08T22:14:00.7114846Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h:53:19: warning: comparison of integer expressions of different signedness: ‘long unsigned int’ and ‘int64_t’ {aka ‘long int’} [-Wsign-compare] 2024-08-08T22:14:00.7118002Z 53 | if (t.numel() > capacity_) { 2024-08-08T22:14:00.7118810Z PASSED [8.4078s] [ 23%] 2024-08-08T22:14:00.7121325Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [6.6598s] [ 25%] 2024-08-08T22:14:00.7125286Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 26%] 2024-08-08T22:14:00.7129308Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.4279s] [ 28%] 2024-08-08T22:14:00.7133560Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [14.4405s] [ 29%] 2024-08-08T22:14:00.7137336Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (Skipped!) [ 31%] 2024-08-08T22:14:00.7141471Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.2741s] [ 32%] 2024-08-08T22:14:00.7145532Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0032s] (requires CUDA) [ 34%] 2024-08-08T22:14:00.7149616Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 35%] 2024-08-08T22:14:00.7153900Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 37%] 2024-08-08T22:14:00.7158657Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0028s] (requires CUDA) [ 39%] 2024-08-08T22:14:00.7162722Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0032s] (requires CUDA) [ 40%] 2024-08-08T22:14:00.7167599Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 42%] 2024-08-08T22:14:00.7172872Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_reuse_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 43%] 2024-08-08T22:14:00.7178037Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_conv_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 45%] 2024-08-08T22:14:00.7183072Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_duplicated_params_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 46%] 2024-08-08T22:14:00.7189190Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_cat_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7194283Z from /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp:1: 2024-08-08T22:14:00.7201658Z /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp: In member function ‘Outputs torch::aot_inductor::AOTInductorModel::run_impl_minimal_arrayref_interface(const Inputs&, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle) [with Inputs = std::tuple, torch::aot_inductor::ArrayRefTensor >; Outputs = std::tuple >; torch::aot_inductor::DeviceStreamType = void*; AOTIProxyExecutorHandle = AOTIProxyExecutorOpaque*]’: 2024-08-08T22:14:00.7209215Z /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp:549:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-08T22:14:00.7212005Z 549 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg0_1, &arg0_1_size)); 2024-08-08T22:14:00.7212997Z | ^~~~~~ 2024-08-08T22:14:00.7213761Z | | 2024-08-08T22:14:00.7214648Z | torch::aot_inductor::ArrayRefTensor 2024-08-08T22:14:00.7217335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-08T22:14:00.7219644Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-08T22:14:00.7220375Z | ^~~~ 2024-08-08T22:14:00.7221935Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-08T22:14:00.7224489Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7226953Z from /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp:1: 2024-08-08T22:14:00.7230495Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-08T22:14:00.7232949Z 207 | AtenTensorHandle tensor, 2024-08-08T22:14:00.7233575Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-08T22:14:00.7235371Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7237914Z from /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp:1: 2024-08-08T22:14:00.7241862Z /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp:552:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-08T22:14:00.7244818Z 552 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg1_1, &arg1_1_size)); 2024-08-08T22:14:00.7245862Z | ^~~~~~ 2024-08-08T22:14:00.7246627Z | | 2024-08-08T22:14:00.7247475Z | torch::aot_inductor::ArrayRefTensor 2024-08-08T22:14:00.7249757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-08T22:14:00.7251760Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-08T22:14:00.7252891Z | ^~~~ 2024-08-08T22:14:00.7254486Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-08T22:14:00.7257109Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7259995Z from /tmp/tmp3k4e4qak/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/c3v33d7v7behktktydmmiqyyht4trpzqazpueon6gu7n36k7oxkf.cpp:1: 2024-08-08T22:14:00.7263636Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-08T22:14:00.7265885Z 207 | AtenTensorHandle tensor, 2024-08-08T22:14:00.7266517Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-08T22:14:00.7267244Z XFAIL [2.9669s] [ 48%] 2024-08-08T22:14:00.7270437Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_embedding_bag_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-08T22:14:00.7275550Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fqn_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 51%] 2024-08-08T22:14:00.7280889Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.3282s] [ 53%] 2024-08-08T22:14:00.7286424Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_misaligned_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 54%] 2024-08-08T22:14:00.7291508Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 56%] 2024-08-08T22:14:00.7296366Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_simple_split_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 57%] 2024-08-08T22:14:00.7301500Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_sympy_fn_like_arg_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0028s] (requires CUDA) [ 59%] 2024-08-08T22:14:00.7305889Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_nested_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [12.5862s] [ 60%] 2024-08-08T22:14:00.7309245Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_consecutive_compiles_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [17.1648s] [ 62%] 2024-08-08T22:14:00.7312723Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_empty_graph_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.1647s] [ 64%] 2024-08-08T22:14:00.7316576Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_multi_device_abi_compatible_cuda <- test/inductor/test_torchinductor.py W0808 21:53:33.543000 41931 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-08T22:14:00.7319236Z W0808 21:53:33.544000 41931 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-08T22:14:00.7320382Z PASSED [9.9648s] [ 65%] 2024-08-08T22:14:00.7322490Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_quantized_linear_abi_compatible_cuda SKIPPED [0.0002s] (Skipped!) [ 67%] 2024-08-08T22:14:00.7325215Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cuda PASSED [9.0824s] [ 68%] 2024-08-08T22:14:00.7328450Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda PASSED [12.9099s] [ 70%] 2024-08-08T22:14:00.7331780Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cuda PASSED [8.1479s] [ 71%] 2024-08-08T22:14:00.7334954Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_view_outputs_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.7251s] [ 73%] 2024-08-08T22:14:00.7338006Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_while_loop_with_outer_code_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.7723s] [ 75%] 2024-08-08T22:14:00.7341117Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_embedding_bag_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.4402s] [ 76%] 2024-08-08T22:14:00.7344215Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_empty_graph_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.6304s] [ 78%] 2024-08-08T22:14:00.7347166Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_default_cuda_device_non_abi_compatible_cpu SKIPPED [0.0002s] (requires multiple cuda devices) [ 79%] 2024-08-08T22:14:00.7350677Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_tensor_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0808 21:54:59.188000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7353490Z W0808 21:54:59.200000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7354951Z W0808 21:55:13.513000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7356378Z W0808 21:55:13.525000 41931 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7357483Z PASSED [29.7197s] [ 81%] 2024-08-08T22:14:00.7359690Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_shape_failed_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 82%] 2024-08-08T22:14:00.7363932Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_seq_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py In file included from /tmp/tmpyr5brbvj/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/cowqmvr2pc5wfetqljen2uu3ljsx5atqy6c5gc5rwbqsv63lxfy2.cpp:463: 2024-08-08T22:14:00.7368530Z /tmp/tmpyr5brbvj/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h: In instantiation of ‘Welford welford_combine(const Welford&, T, const WeightRecp*) [with T = at::vec::CPU_CAPABILITY::Vectorized]’: 2024-08-08T22:14:00.7371713Z /tmp/tmpyr5brbvj/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/cowqmvr2pc5wfetqljen2uu3ljsx5atqy6c5gc5rwbqsv63lxfy2.cpp:482:81: required from here 2024-08-08T22:14:00.7375820Z /tmp/tmpyr5brbvj/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h:107:35: warning: comparison of integer expressions of different signedness: ‘const int64_t’ {aka ‘const long int’} and ‘std::vector >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare] 2024-08-08T22:14:00.7378896Z 107 | ((w == nullptr || acc.index >= w->weight_recps.size()) 2024-08-08T22:14:00.7379751Z PASSED [15.7361s] [ 84%] 2024-08-08T22:14:00.7382064Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cpu SKIPPED [0.0032s] (requires CUDA) [ 85%] 2024-08-08T22:14:00.7385325Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_with_none_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (requires CUDA) [ 87%] 2024-08-08T22:14:00.7388306Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_unsupported_input_dtype_non_abi_compatible_cpu PASSED [0.0410s] [ 89%] 2024-08-08T22:14:00.7391296Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_amp_fallback_random_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.5220s] [ 90%] 2024-08-08T22:14:00.7394694Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_dup_unbacked_sym_decl_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.8378s] [ 92%] 2024-08-08T22:14:00.7397918Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_embedding_bag_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.8664s] [ 93%] 2024-08-08T22:14:00.7401204Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_foreach_multiple_dynamic_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.5208s] [ 95%] 2024-08-08T22:14:00.7404422Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fp8_view_of_param_non_abi_compatible_cuda SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 96%] 2024-08-08T22:14:00.7407640Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_index_put_with_none_index_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.0630s] [ 98%] 2024-08-08T22:14:00.7410886Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda SKIPPED [0.0002s] (requires multiple cuda devices) [100%] 2024-08-08T22:14:00.7412511Z 2024-08-08T22:14:00.7413948Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-1b9316ebd13cce28.xml - 2024-08-08T22:14:00.7416743Z ============ 32 passed, 31 skipped, 1 xfailed in 390.82s (0:06:30) ============= 2024-08-08T22:14:00.7417965Z Got exit code -11 (SIGSEGV) 2024-08-08T22:14:00.7418505Z Retrying single test... 2024-08-08T22:14:00.7419984Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-f60bbe16c1ada337.xml 2024-08-08T22:14:00.7421709Z ============================= test session starts ============================== 2024-08-08T22:14:00.7423132Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:00.7424147Z cachedir: .pytest_cache 2024-08-08T22:14:00.7425411Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:00.7426698Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:00.7427289Z configfile: pytest.ini 2024-08-08T22:14:00.7428412Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:00.7429682Z collecting ... collected 912 items 2024-08-08T22:14:00.7430762Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T22:14:00.7431497Z Running 64 items in this shard 2024-08-08T22:14:00.7431872Z 2024-08-08T22:14:00.7433494Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.0551s] [ 1%] 2024-08-08T22:14:00.7436799Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.9246s] [ 3%] 2024-08-08T22:14:00.7439937Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 4%] 2024-08-08T22:14:00.7443178Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.7015s] [ 6%] 2024-08-08T22:14:00.7446281Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.3257s] [ 7%] 2024-08-08T22:14:00.7449312Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.6124s] [ 9%] 2024-08-08T22:14:00.7452729Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0808 22:03:02.140000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7455375Z W0808 22:03:02.153000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7456954Z W0808 22:03:16.481000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7458524Z W0808 22:03:16.493000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7459661Z PASSED [29.5027s] [ 10%] 2024-08-08T22:14:00.7461681Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.6395s] [ 12%] 2024-08-08T22:14:00.7464970Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 14%] 2024-08-08T22:14:00.7468354Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 15%] 2024-08-08T22:14:00.7471382Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu SKIPPED [0.0032s] (requires CUDA) [ 17%] 2024-08-08T22:14:00.7474690Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0031s] (requires CUDA) [ 18%] 2024-08-08T22:14:00.7477768Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0031s] (requires CUDA) [ 20%] 2024-08-08T22:14:00.7481449Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (Skipped!) [ 21%] 2024-08-08T22:14:00.7486389Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py In file included from /tmp/tmppap3cfr7/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/ce242fqygzreaae2okmdyyy4tqhzwvozdnxqdbss2znqti6tdqg3.cpp:5: 2024-08-08T22:14:00.7492420Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h: In instantiation of ‘void torch::aot_inductor::ThreadLocalCachedOutputTensor >::copy_data_from(const torch::aot_inductor::ArrayRefTensor&) [with T = float]’: 2024-08-08T22:14:00.7496396Z /tmp/tmppap3cfr7/c7i5lqgiqlmv4idwxsjrqmsvd46uj5fkzsspijrpot2vur3xj2nr/ce242fqygzreaae2okmdyyy4tqhzwvozdnxqdbss2znqti6tdqg3.cpp:633:44: required from here 2024-08-08T22:14:00.7500326Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/thread_local.h:53:19: warning: comparison of integer expressions of different signedness: ‘long unsigned int’ and ‘int64_t’ {aka ‘long int’} [-Wsign-compare] 2024-08-08T22:14:00.7502811Z 53 | if (t.numel() > capacity_) { 2024-08-08T22:14:00.7503624Z PASSED [7.7135s] [ 23%] 2024-08-08T22:14:00.7505971Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [6.6548s] [ 25%] 2024-08-08T22:14:00.7509815Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 26%] 2024-08-08T22:14:00.7514188Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.4105s] [ 28%] 2024-08-08T22:14:00.7518562Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [14.5961s] [ 29%] 2024-08-08T22:14:00.7522213Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (Skipped!) [ 31%] 2024-08-08T22:14:00.7525647Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.2671s] [ 32%] 2024-08-08T22:14:00.7529574Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0031s] (requires CUDA) [ 34%] 2024-08-08T22:14:00.7539972Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 35%] 2024-08-08T22:14:00.7544078Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 37%] 2024-08-08T22:14:00.7548132Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0029s] (requires CUDA) [ 39%] 2024-08-08T22:14:00.7552292Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0033s] (requires CUDA) [ 40%] 2024-08-08T22:14:00.7557042Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 42%] 2024-08-08T22:14:00.7562018Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_buffer_reuse_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 43%] 2024-08-08T22:14:00.7566877Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_conv_freezing_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 45%] 2024-08-08T22:14:00.7572014Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_duplicated_params_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 46%] 2024-08-08T22:14:00.7577395Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_cat_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7581010Z from /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp:1: 2024-08-08T22:14:00.7587294Z /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp: In member function ‘Outputs torch::aot_inductor::AOTInductorModel::run_impl_minimal_arrayref_interface(const Inputs&, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle) [with Inputs = std::tuple, torch::aot_inductor::ArrayRefTensor >; Outputs = std::tuple >; torch::aot_inductor::DeviceStreamType = void*; AOTIProxyExecutorHandle = AOTIProxyExecutorOpaque*]’: 2024-08-08T22:14:00.7594004Z /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp:549:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-08T22:14:00.7596519Z 549 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg0_1, &arg0_1_size)); 2024-08-08T22:14:00.7597422Z | ^~~~~~ 2024-08-08T22:14:00.7598124Z | | 2024-08-08T22:14:00.7598888Z | torch::aot_inductor::ArrayRefTensor 2024-08-08T22:14:00.7601199Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-08T22:14:00.7603061Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-08T22:14:00.7603749Z | ^~~~ 2024-08-08T22:14:00.7605198Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-08T22:14:00.7607384Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7609593Z from /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp:1: 2024-08-08T22:14:00.7613098Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-08T22:14:00.7615671Z 207 | AtenTensorHandle tensor, 2024-08-08T22:14:00.7616289Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-08T22:14:00.7618219Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7620598Z from /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp:1: 2024-08-08T22:14:00.7624174Z /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp:552:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’} 2024-08-08T22:14:00.7626796Z 552 | AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg1_1, &arg1_1_size)); 2024-08-08T22:14:00.7627789Z | ^~~~~~ 2024-08-08T22:14:00.7628534Z | | 2024-08-08T22:14:00.7629343Z | torch::aot_inductor::ArrayRefTensor 2024-08-08T22:14:00.7631623Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:34:8: note: in definition of macro ‘AOTI_TORCH_ERROR_CODE_CHECK’ 2024-08-08T22:14:00.7633868Z 34 | if ((call) != AOTI_TORCH_SUCCESS) { \ 2024-08-08T22:14:00.7634602Z | ^~~~ 2024-08-08T22:14:00.7636198Z In file included from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/utils.h:14, 2024-08-08T22:14:00.7638722Z from /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3, 2024-08-08T22:14:00.7641115Z from /tmp/tmpd_14d4wl/c3j2fj4ptr2nawvilbojufkbbrvq44u2mvv52lj4hpjwwurgbd62/cs6eyyzrxpy2kdmyyk6qw57uuwmhg5fyio5oxu4kwlfxgnz5best.cpp:1: 2024-08-08T22:14:00.7644455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/inductor/aoti_torch/c/shim.h:207:22: note: initializing argument 1 of ‘AOTITorchError aoti_torch_get_sizes(AtenTensorHandle, int64_t**)’ 2024-08-08T22:14:00.7646520Z 207 | AtenTensorHandle tensor, 2024-08-08T22:14:00.7647130Z | ~~~~~~~~~~~~~~~~~^~~~~~ 2024-08-08T22:14:00.7647809Z XFAIL [2.9660s] [ 48%] 2024-08-08T22:14:00.7650545Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_embedding_bag_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-08T22:14:00.7655564Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fqn_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 51%] 2024-08-08T22:14:00.7660231Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_contiguous_output_alias_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [8.0205s] [ 53%] 2024-08-08T22:14:00.7664259Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_misaligned_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 54%] 2024-08-08T22:14:00.7668544Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_output_path_1_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 56%] 2024-08-08T22:14:00.7672208Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_simple_split_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0001s] (Skipped!) [ 57%] 2024-08-08T22:14:00.7675430Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_sympy_fn_like_arg_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0028s] (requires CUDA) [ 59%] 2024-08-08T22:14:00.7677973Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_nested_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [12.6098s] [ 60%] 2024-08-08T22:14:00.7679901Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_consecutive_compiles_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.9970s] [ 62%] 2024-08-08T22:14:00.7681813Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_empty_graph_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.1373s] [ 64%] 2024-08-08T22:14:00.7683930Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_multi_device_abi_compatible_cuda <- test/inductor/test_torchinductor.py W0808 22:05:11.826000 46224 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-08T22:14:00.7685541Z W0808 22:05:11.827000 46224 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-08T22:14:00.7686255Z PASSED [10.1231s] [ 65%] 2024-08-08T22:14:00.7687377Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_quantized_linear_abi_compatible_cuda SKIPPED [0.0002s] (Skipped!) [ 67%] 2024-08-08T22:14:00.7689197Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cuda PASSED [9.0654s] [ 68%] 2024-08-08T22:14:00.7691122Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_abi_compatible_cuda PASSED [12.8672s] [ 70%] 2024-08-08T22:14:00.7693053Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cuda PASSED [8.1153s] [ 71%] 2024-08-08T22:14:00.7694949Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_view_outputs_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [7.7364s] [ 73%] 2024-08-08T22:14:00.7696952Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_while_loop_with_outer_code_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [8.9681s] [ 75%] 2024-08-08T22:14:00.7698929Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_embedding_bag_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.3525s] [ 76%] 2024-08-08T22:14:00.7700855Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_empty_graph_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [14.4035s] [ 78%] 2024-08-08T22:14:00.7702803Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_default_cuda_device_non_abi_compatible_cpu SKIPPED [0.0002s] (requires multiple cuda devices) [ 79%] 2024-08-08T22:14:00.7705074Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_non_tensor_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0808 22:06:37.427000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7706820Z W0808 22:06:37.440000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7707806Z W0808 22:06:51.885000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7708841Z W0808 22:06:51.897000 46224 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:00.7709621Z PASSED [29.9997s] [ 81%] 2024-08-08T22:14:00.7711023Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_shape_failed_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 82%] 2024-08-08T22:14:00.7713908Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_seq_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py In file included from /tmp/tmph5rqrwob/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/czc6i5fygbxhrnic56uy4gwh4jdr2ar4hiiioqdx7pbpueewe6ij.cpp:463: 2024-08-08T22:14:00.7717318Z /tmp/tmph5rqrwob/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h: In instantiation of ‘Welford welford_combine(const Welford&, T, const WeightRecp*) [with T = at::vec::CPU_CAPABILITY::Vectorized]’: 2024-08-08T22:14:00.7719338Z /tmp/tmph5rqrwob/cpt2qkjbtlwf4o6h7sbqkwf5eqnfvfdcbvk6sqw3yvecy7cwtkrh/czc6i5fygbxhrnic56uy4gwh4jdr2ar4hiiioqdx7pbpueewe6ij.cpp:482:81: required from here 2024-08-08T22:14:00.7721948Z /tmp/tmph5rqrwob/kn/cknvu7lkosis6uvucfo2ku5c3mihl5oivnryjnqptmxl6vmqywnv.h:107:35: warning: comparison of integer expressions of different signedness: ‘const int64_t’ {aka ‘const long int’} and ‘std::vector >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare] 2024-08-08T22:14:00.7723959Z 107 | ((w == nullptr || acc.index >= w->weight_recps.size()) 2024-08-08T22:14:00.7724550Z PASSED [15.7020s] [ 84%] 2024-08-08T22:14:00.7725952Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_non_abi_compatible_cpu SKIPPED [0.0033s] (requires CUDA) [ 85%] 2024-08-08T22:14:00.7728148Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_with_none_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0031s] (requires CUDA) [ 87%] 2024-08-08T22:14:00.7730068Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_unsupported_input_dtype_non_abi_compatible_cpu PASSED [0.0400s] [ 89%] 2024-08-08T22:14:00.7731912Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_amp_fallback_random_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.5660s] [ 90%] 2024-08-08T22:14:00.7733929Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_dup_unbacked_sym_decl_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4842s] [ 92%] 2024-08-08T22:14:00.7736042Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_embedding_bag_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.2092s] [ 93%] 2024-08-08T22:14:00.7738106Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_foreach_multiple_dynamic_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4199s] [ 95%] 2024-08-08T22:14:00.7740099Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_fp8_view_of_param_non_abi_compatible_cuda SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 96%] 2024-08-08T22:14:00.7742085Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_index_put_with_none_index_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.6410s] [ 98%] 2024-08-08T22:14:00.7744199Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda SKIPPED [0.0002s] (requires multiple cuda devices) [100%] 2024-08-08T22:14:00.7745186Z 2024-08-08T22:14:00.7746159Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-f60bbe16c1ada337.xml - 2024-08-08T22:14:00.7747655Z ============ 32 passed, 31 skipped, 1 xfailed in 390.08s (0:06:30) ============= 2024-08-08T22:14:00.7748438Z Got exit code -11 (SIGSEGV) 2024-08-08T22:14:00.7748801Z Retrying single test... 2024-08-08T22:14:00.7749739Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-29f6d08ef7270379.xml 2024-08-08T22:14:00.7750801Z ============================= test session starts ============================== 2024-08-08T22:14:00.7751710Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:00.7752602Z cachedir: .pytest_cache 2024-08-08T22:14:00.7753476Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:00.7754308Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:00.7754712Z configfile: pytest.ini 2024-08-08T22:14:00.7755484Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:00.7756510Z collecting ... collected 912 items / 63 deselected / 849 selected 2024-08-08T22:14:00.7757791Z stepcurrent: skipping 63 already run items. Running only test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda 2024-08-08T22:14:00.7758904Z Running 1 items in this shard 2024-08-08T22:14:00.7759156Z 2024-08-08T22:14:00.7760158Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda SKIPPED [0.0002s] (requires multiple cuda devices) [100%] 2024-08-08T22:14:00.7761145Z 2024-08-08T22:14:00.7762031Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-29f6d08ef7270379.xml - 2024-08-08T22:14:00.7763453Z ====================== 1 skipped, 63 deselected in 0.08s ======================= 2024-08-08T22:14:00.7764134Z Got exit code 0 2024-08-08T22:14:00.7764609Z Test succeeeded in new process, continuing with the rest of the tests 2024-08-08T22:14:00.7765760Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-591d916419a4856e.xml 2024-08-08T22:14:00.7766831Z ============================= test session starts ============================== 2024-08-08T22:14:00.7767737Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:00.7768487Z cachedir: .pytest_cache 2024-08-08T22:14:00.7769356Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:00.7770188Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:00.7770585Z configfile: pytest.ini 2024-08-08T22:14:00.7771354Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:00.7772311Z collecting ... collected 912 items / 64 deselected / 848 selected 2024-08-08T22:14:00.7772904Z stepcurrent: skipping 64 already run items. 2024-08-08T22:14:00.7773361Z Running 0 items in this shard 2024-08-08T22:14:00.7773617Z 2024-08-08T22:14:00.7774491Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-591d916419a4856e.xml - 2024-08-08T22:14:00.7775895Z ============================ 64 deselected in 0.07s ============================ 2024-08-08T22:14:00.7777555Z The following tests failed and then succeeded when run in a new process['ul', 'test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_non_default_cuda_device_non_abi_compatible_cuda'] 2024-08-08T22:14:00.7778688Z 2024-08-08T22:14:00.7779406Z FINISHED PRINTING LOG FILE of inductor/test_aot_inductor 9/16 (test/test-reports/inductor.test_aot_inductor_9.16_f828ddeda7877015_.log) 2024-08-08T22:14:00.7780160Z 2024-08-08T22:14:04.0177541Z Running inductor/test_compiled_optimizers 3/4 ... [2024-08-08 22:14:04.017109] 2024-08-08T22:14:04.0180338Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'not serial', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 22:14:04.017515] 2024-08-08T22:14:17.9442290Z 2024-08-08T22:14:17.9444691Z PRINTING LOG FILE of inductor/test_aot_inductor 7/16 (test/test-reports/inductor.test_aot_inductor_7.16_456039372a27472d_.log) 2024-08-08T22:14:17.9446919Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-ef9481da91af52d2.xml 2024-08-08T22:14:17.9448538Z ============================= test session starts ============================== 2024-08-08T22:14:17.9449886Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:17.9451216Z cachedir: .pytest_cache 2024-08-08T22:14:17.9452528Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:17.9453732Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:17.9454299Z configfile: pytest.ini 2024-08-08T22:14:17.9455433Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:17.9456642Z collecting ... collected 912 items 2024-08-08T22:14:17.9457376Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T22:14:17.9519074Z Running 58 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cpu_with_stack_allocation, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_original_fqn_and_dtype_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_smem_above_default_limit_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fx_gm_return_tuple_validation_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_output_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_tensor_input_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_view_outputs_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_simple_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_dynamic_smem_above_default_limit_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fake_tensor_device_validation_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_missing_cubin_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_size_weight_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_4_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_int_list_input_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_misc_1_max_autotune_True_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_model_modified_weights_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_output_path_1_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_quantized_linear_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_reuse_kernel_dynamic_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_fp8_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_zero_size_weight_non_abi_compatible_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_2_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_int_list_input_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_nested_tensor_from_jagged_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_scatter_reduce_fallback_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_equal_to_1_float_arg_dynamic_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_non_abi_compatible_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda 2024-08-08T22:14:17.9579509Z 2024-08-08T22:14:17.9581193Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.0797s] [ 1%] 2024-08-08T22:14:17.9584523Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.8459s] [ 3%] 2024-08-08T22:14:17.9587462Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.5553s] [ 5%] 2024-08-08T22:14:17.9590327Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4136s] [ 6%] 2024-08-08T22:14:17.9593342Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4254s] [ 8%] 2024-08-08T22:14:17.9596195Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0031s] (requires CUDA) [ 10%] 2024-08-08T22:14:17.9599555Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0808 21:50:01.331000 41421 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-08T22:14:17.9601678Z PASSED [10.2162s] [ 12%] 2024-08-08T22:14:17.9603492Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.5263s] [ 13%] 2024-08-08T22:14:17.9606300Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.6992s] [ 15%] 2024-08-08T22:14:17.9609123Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4628s] [ 17%] 2024-08-08T22:14:17.9611943Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu SKIPPED [0.0046s] (requires CUDA) [ 18%] 2024-08-08T22:14:17.9614875Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0044s] (requires CUDA) [ 20%] 2024-08-08T22:14:17.9618066Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cpu SKIPPED [0.0100s] (requires CUDA) [ 22%] 2024-08-08T22:14:17.9621476Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.4430s] [ 24%] 2024-08-08T22:14:17.9625002Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 25%] 2024-08-08T22:14:17.9628517Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [11.2085s] [ 27%] 2024-08-08T22:14:17.9631936Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [8.5064s] [ 29%] 2024-08-08T22:14:17.9635441Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 31%] 2024-08-08T22:14:17.9639192Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 32%] 2024-08-08T22:14:17.9642848Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 34%] 2024-08-08T22:14:17.9646256Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0001s] (Skipped!) [ 36%] 2024-08-08T22:14:17.9650460Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0032s] (requires CUDA) [ 37%] 2024-08-08T22:14:17.9654517Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (requires CUDA) [ 39%] 2024-08-08T22:14:17.9658366Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0034s] (requires CUDA) [ 41%] 2024-08-08T22:14:17.9662272Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 43%] 2024-08-08T22:14:17.9666718Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_original_fqn_and_dtype_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [15.5179s] [ 44%] 2024-08-08T22:14:17.9671411Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_smem_above_default_limit_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 46%] 2024-08-08T22:14:17.9676105Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fx_gm_return_tuple_validation_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [0.0129s] [ 48%] 2024-08-08T22:14:17.9680632Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_output_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-08T22:14:17.9685376Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_tensor_input_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py W0808 21:51:15.944000 41421 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:17.9688377Z W0808 21:51:15.957000 41421 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:17.9689757Z W0808 21:51:30.203000 41421 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:17.9691050Z W0808 21:51:30.216000 41421 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:17.9692131Z PASSED [29.6150s] [ 51%] 2024-08-08T22:14:17.9694944Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0030s] (requires CUDA) [ 53%] 2024-08-08T22:14:17.9699358Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0027s] (requires CUDA) [ 55%] 2024-08-08T22:14:17.9703554Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_view_outputs_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.3732s] [ 56%] 2024-08-08T22:14:17.9707785Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_simple_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 58%] 2024-08-08T22:14:17.9712527Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cuda W0808 21:51:52.950000 41421 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:14:17.9717601Z W0808 21:51:52.958000 41421 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:14:17.9719664Z PASSED [8.6544s] [ 60%] 2024-08-08T22:14:17.9721728Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_dynamic_smem_above_default_limit_abi_compatible_cuda SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 62%] 2024-08-08T22:14:17.9724784Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fake_tensor_device_validation_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [0.0370s] [ 63%] 2024-08-08T22:14:17.9727564Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_missing_cubin_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [28.9146s] [ 65%] 2024-08-08T22:14:17.9732715Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_size_weight_abi_compatible_cuda <- test/inductor/test_torchinductor.py /tmp/tmpwd_q5vvb/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/c4l6qijdgpfhu64w7s74uh754y7pslgzx7wbvlqtbrqeuvyzlmuh.cpp: In member function ‘void torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle)’: 2024-08-08T22:14:17.9741271Z /tmp/tmpwd_q5vvb/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/c4l6qijdgpfhu64w7s74uh754y7pslgzx7wbvlqtbrqeuvyzlmuh.cpp:603:10: warning: variable ‘L__self___net_0_weight’ set but not used [-Wunused-but-set-variable] 2024-08-08T22:14:17.9743493Z 603 | auto L__self___net_0_weight = constants_->at(0); 2024-08-08T22:14:17.9744176Z | ^~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9744832Z PASSED [8.0035s] [ 67%] 2024-08-08T22:14:17.9746738Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_4_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 68%] 2024-08-08T22:14:17.9749604Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_int_list_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.0139s] [ 70%] 2024-08-08T22:14:17.9752853Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_misc_1_max_autotune_True_non_abi_compatible_cpu E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-08T22:14:17.9754746Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:17.9755736Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-08T22:14:17.9764767Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.so 2024-08-08T22:14:17.9773059Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:17.9774035Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-08T22:14:17.9776811Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:17.9781113Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:17.9784018Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:17.9785641Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-08T22:14:17.9788778Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:17.9793119Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:17.9796065Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9797590Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9800544Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:17.9804425Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:17.9807884Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9810119Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 208 | AOTI_TORCH_CHECK( 2024-08-08T22:14:17.9811386Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9812600Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:17.9816008Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:17.9818927Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:242:25: required from here 2024-08-08T22:14:17.9821269Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9823202Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9824440Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9825550Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:17.9827909Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:17.9830404Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:255:25: required from here 2024-08-08T22:14:17.9833058Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9834928Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9836164Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9837273Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:17.9838857Z E0808 21:52:55.387000 41421 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-08T22:14:17.9840747Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-08T22:14:17.9841718Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-08T22:14:17.9842518Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:17.9843421Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-08T22:14:17.9849962Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.so 2024-08-08T22:14:17.9856393Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:17.9857201Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-08T22:14:17.9859161Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:17.9862319Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:17.9864394Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:17.9865494Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-08T22:14:17.9867556Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:17.9871919Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:17.9874266Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9875438Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9877523Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:17.9880195Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:17.9882873Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9884545Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 208 | AOTI_TORCH_CHECK( 2024-08-08T22:14:17.9885494Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9886390Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:17.9888664Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:17.9891152Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:242:25: required from here 2024-08-08T22:14:17.9893550Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9895455Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9896678Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9897766Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:17.9900112Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:17.9902595Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:255:25: required from here 2024-08-08T22:14:17.9904871Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/mx/cmxw476asnh2fy3qvjzsb3oe7bqs2kynndr2heikqyr6qstgn2tf.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9906767Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9908001Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9909096Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:17.9909963Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] . 2024-08-08T22:14:17.9910750Z E0808 21:52:57.066000 41421 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-08T22:14:17.9911668Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-08T22:14:17.9912635Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:17.9913365Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-08T22:14:17.9920637Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.so 2024-08-08T22:14:17.9926740Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:17.9927479Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-08T22:14:17.9929568Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:17.9932834Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:17.9934900Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:17.9935976Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-08T22:14:17.9938057Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:17.9941253Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:17.9943494Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9944762Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9946890Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:17.9949653Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:17.9952393Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9954103Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:17.9955064Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9955978Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:17.9958392Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:17.9960945Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:198:25: required from here 2024-08-08T22:14:17.9963337Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9965221Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9966533Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9967636Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:17.9970108Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:17.9972682Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:211:25: required from here 2024-08-08T22:14:17.9974991Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:17.9976870Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:17.9978116Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:17.9979196Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:17.9980817Z E0808 21:52:58.645000 41421 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-08T22:14:17.9982373Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-08T22:14:17.9983302Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-08T22:14:17.9984086Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:17.9984822Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-08T22:14:17.9991527Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.so 2024-08-08T22:14:17.9997724Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:17.9998457Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-08T22:14:18.0000406Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0003601Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0005732Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0006857Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0008933Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0012103Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0014315Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0016027Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0018160Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0021080Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0023666Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0025353Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0026297Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0027205Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0029499Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0032308Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:198:25: required from here 2024-08-08T22:14:18.0034631Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0036500Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0037733Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0038825Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0041270Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0043917Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:211:25: required from here 2024-08-08T22:14:18.0046626Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/4w/c4wpdwpvb3gez77mwkpjo5qfdh5fssco4aylzeb2b2qmn63mbq7n.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0048787Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0050216Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0051460Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0052439Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] . 2024-08-08T22:14:18.0053324Z E0808 21:53:00.299000 41421 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-08T22:14:18.0054361Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-08T22:14:18.0055351Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0056157Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-08T22:14:18.0062727Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.so 2024-08-08T22:14:18.0068818Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0069557Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-08T22:14:18.0071570Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0074939Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0077027Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0078112Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0080271Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0083499Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0085714Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0086890Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0089003Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0091724Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0094280Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0096024Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0096980Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0097905Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0100210Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0102729Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:198:25: required from here 2024-08-08T22:14:18.0105047Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0106982Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0108223Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0109321Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0111678Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0114359Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:211:25: required from here 2024-08-08T22:14:18.0117266Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0119322Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0120560Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0121657Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0123238Z E0808 21:53:01.967000 41421 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-08T22:14:18.0124808Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-08T22:14:18.0125737Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-08T22:14:18.0126521Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0127271Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-08T22:14:18.0135046Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.so 2024-08-08T22:14:18.0141206Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0141943Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-08T22:14:18.0143914Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0147190Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0149278Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0150350Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0152686Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0155977Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0158254Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0159433Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0161563Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0164311Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0166905Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0168584Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0169531Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0170503Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0172816Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0175343Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:198:25: required from here 2024-08-08T22:14:18.0177667Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0179548Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0180787Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0181888Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0184291Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0186808Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:211:25: required from here 2024-08-08T22:14:18.0189125Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] /tmp/tmpd5a19t04/eu/ceuyu4ps6vib6nxt33r6dt5vnrukvepgpfaunnp7trhvy6cev2vh.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0190996Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0192449Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0193591Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0194468Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] . 2024-08-08T22:14:18.0195264Z E0808 21:53:03.588000 41421 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-08T22:14:18.0196011Z PASSED [25.3849s] [ 72%] 2024-08-08T22:14:18.0197363Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_model_modified_weights_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.6519s] [ 74%] 2024-08-08T22:14:18.0199370Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_output_path_1_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5487s] [ 75%] 2024-08-08T22:14:18.0201632Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_quantized_linear_non_abi_compatible_cpu [W808 21:53:50.209511082 QuantizedLinear.cpp:383] Warning: fbgemm_pack_gemm_matrix_fp16 is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-08T22:14:18.0203909Z [W808 21:54:05.088649903 QuantizedLinear.cpp:418] Warning: fbgemm_linear_fp16_weight_fp32_activation is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-08T22:14:18.0205123Z PASSED [14.9388s] [ 77%] 2024-08-08T22:14:18.0206435Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_reuse_kernel_dynamic_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.7956s] [ 79%] 2024-08-08T22:14:18.0208420Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_fp8_non_abi_compatible_cpu SKIPPED [0.0003s] (FP8 is only supported on H100+) [ 81%] 2024-08-08T22:14:18.0210534Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0045s] (requires CUDA) [ 82%] 2024-08-08T22:14:18.0212815Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu SKIPPED [0.0042s] (requires CUDA) [ 84%] 2024-08-08T22:14:18.0214879Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_zero_size_weight_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.5203s] [ 86%] 2024-08-08T22:14:18.0217323Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_2_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.3287s] [ 87%] 2024-08-08T22:14:18.0219474Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_int_list_input_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.8255s] [ 89%] 2024-08-08T22:14:18.0221509Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_nested_tensor_from_jagged_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4362s] [ 91%] 2024-08-08T22:14:18.0223610Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_scatter_reduce_fallback_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [15.7107s] [ 93%] 2024-08-08T22:14:18.0225573Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_equal_to_1_float_arg_dynamic_True_non_abi_compatible_cuda PASSED [16.2396s] [ 94%] 2024-08-08T22:14:18.0227599Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda PASSED [21.3739s] [ 96%] 2024-08-08T22:14:18.0229689Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_non_abi_compatible_cuda PASSED [16.2212s] [ 98%] 2024-08-08T22:14:18.0231532Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda PASSED [0.2965s] [100%] 2024-08-08T22:14:18.0232518Z 2024-08-08T22:14:18.0233409Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-ef9481da91af52d2.xml - 2024-08-08T22:14:18.0234845Z ================== 35 passed, 23 skipped in 434.02s (0:07:14) ================== 2024-08-08T22:14:18.0235590Z Got exit code -11 (SIGSEGV) 2024-08-08T22:14:18.0235966Z Retrying single test... 2024-08-08T22:14:18.0236901Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-301046bc9c4f8c99.xml 2024-08-08T22:14:18.0237971Z ============================= test session starts ============================== 2024-08-08T22:14:18.0238885Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:18.0239578Z cachedir: .pytest_cache 2024-08-08T22:14:18.0240449Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:18.0241376Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:18.0241779Z configfile: pytest.ini 2024-08-08T22:14:18.0242603Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:18.0243427Z collecting ... collected 912 items 2024-08-08T22:14:18.0243942Z stepcurrent: Cannot find last run test, not skipping 2024-08-08T22:14:18.0244431Z Running 58 items in this shard 2024-08-08T22:14:18.0244695Z 2024-08-08T22:14:18.0245666Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.2507s] [ 1%] 2024-08-08T22:14:18.0247662Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.2476s] [ 3%] 2024-08-08T22:14:18.0249637Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [8.3933s] [ 5%] 2024-08-08T22:14:18.0251555Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4551s] [ 6%] 2024-08-08T22:14:18.0253545Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.4233s] [ 8%] 2024-08-08T22:14:18.0255492Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 10%] 2024-08-08T22:14:18.0257663Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_abi_compatible_cpu <- test/inductor/test_torchinductor.py W0808 22:02:23.911000 46009 torch/_inductor/utils.py:1358] DeviceCopy in input program 2024-08-08T22:14:18.0259032Z PASSED [10.2430s] [ 12%] 2024-08-08T22:14:18.0260248Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.3285s] [ 13%] 2024-08-08T22:14:18.0262258Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [6.5090s] [ 15%] 2024-08-08T22:14:18.0264230Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [7.3952s] [ 17%] 2024-08-08T22:14:18.0266156Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_abi_compatible_cpu SKIPPED [0.0032s] (requires CUDA) [ 18%] 2024-08-08T22:14:18.0268157Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 20%] 2024-08-08T22:14:18.0270111Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_abi_compatible_cpu SKIPPED [0.0028s] (requires CUDA) [ 22%] 2024-08-08T22:14:18.0272308Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [7.4105s] [ 24%] 2024-08-08T22:14:18.0274659Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 25%] 2024-08-08T22:14:18.0277079Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [10.8771s] [ 27%] 2024-08-08T22:14:18.0279388Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py PASSED [8.3052s] [ 29%] 2024-08-08T22:14:18.0281637Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 31%] 2024-08-08T22:14:18.0303832Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 32%] 2024-08-08T22:14:18.0306327Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 34%] 2024-08-08T22:14:18.0308779Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 36%] 2024-08-08T22:14:18.0311460Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0032s] (requires CUDA) [ 37%] 2024-08-08T22:14:18.0314133Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_abi_compatible_cpu_with_stack_allocation <- test/inductor/test_torchinductor.py SKIPPED [0.0029s] (requires CUDA) [ 39%] 2024-08-08T22:14:18.0317120Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_abi_compatible_cpu_with_stack_allocation SKIPPED [0.0034s] (requires CUDA) [ 41%] 2024-08-08T22:14:18.0320630Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Skipped!) [ 43%] 2024-08-08T22:14:18.0324411Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_constant_original_fqn_and_dtype_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [15.5139s] [ 44%] 2024-08-08T22:14:18.0328360Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_dynamic_smem_above_default_limit_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 46%] 2024-08-08T22:14:18.0332252Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_fx_gm_return_tuple_validation_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [0.0141s] [ 48%] 2024-08-08T22:14:18.0335951Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_missing_output_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 50%] 2024-08-08T22:14:18.0339387Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_non_tensor_input_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py W0808 22:03:37.538000 46009 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:18.0341541Z W0808 22:03:37.551000 46009 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:18.0342544Z W0808 22:03:51.626000 46009 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:18.0343519Z W0808 22:03:51.645000 46009 torch/_dynamo/eval_frame.py:260] could not determine __code__ for aten.add 2024-08-08T22:14:18.0344301Z PASSED [29.0947s] [ 51%] 2024-08-08T22:14:18.0346211Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0030s] (requires CUDA) [ 53%] 2024-08-08T22:14:18.0349341Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface SKIPPED [0.0029s] (requires CUDA) [ 55%] 2024-08-08T22:14:18.0352605Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_view_outputs_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py PASSED [7.2047s] [ 56%] 2024-08-08T22:14:18.0355605Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface::test_while_loop_simple_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface <- test/inductor/test_torchinductor.py SKIPPED [0.0002s] (Skipped!) [ 58%] 2024-08-08T22:14:18.0358895Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_cond_non_tensor_predicates_dynamic_True_abi_compatible_cuda W0808 22:04:13.854000 46009 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:14:18.0362085Z W0808 22:04:13.862000 46009 torch/_higher_order_ops/cond.py:116] Pred is a Python constant. When used with torch.cond, it executes only one of the branches. If you want torch.cond to perserve two branches, please make the predicate a boolean tensor or a SymBool. 2024-08-08T22:14:18.0363653Z PASSED [8.4813s] [ 60%] 2024-08-08T22:14:18.0365154Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_dynamic_smem_above_default_limit_abi_compatible_cuda SKIPPED [0.0002s] (Test was marked as expected failure, but does not fail always anymore.) [ 62%] 2024-08-08T22:14:18.0367338Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_fake_tensor_device_validation_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [0.0349s] [ 63%] 2024-08-08T22:14:18.0369297Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_missing_cubin_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [29.0212s] [ 65%] 2024-08-08T22:14:18.0372755Z inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCuda::test_zero_size_weight_abi_compatible_cuda <- test/inductor/test_torchinductor.py /tmp/tmpudphu3hf/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/cvba6lvyimhi3kradxyynn57vhtdrcl6ghwatsskdo6irs4cimgm.cpp: In member function ‘void torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle)’: 2024-08-08T22:14:18.0376428Z /tmp/tmpudphu3hf/cfs4xeeqht7ut4genqxabrexac2ub7jtnl5rphhzpvx3kutkhxgu/cvba6lvyimhi3kradxyynn57vhtdrcl6ghwatsskdo6irs4cimgm.cpp:603:10: warning: variable ‘L__self___net_0_weight’ set but not used [-Wunused-but-set-variable] 2024-08-08T22:14:18.0377968Z 603 | auto L__self___net_0_weight = constants_->at(0); 2024-08-08T22:14:18.0378463Z | ^~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0378936Z PASSED [8.0298s] [ 67%] 2024-08-08T22:14:18.0380311Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_buffer_mutation_4_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 68%] 2024-08-08T22:14:18.0382357Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_int_list_input_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.2217s] [ 70%] 2024-08-08T22:14:18.0384270Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_misc_1_max_autotune_True_non_abi_compatible_cpu E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-08T22:14:18.0385638Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0386369Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-08T22:14:18.0393111Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.so 2024-08-08T22:14:18.0399167Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0399905Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-08T22:14:18.0401935Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0405079Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0407139Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0408214Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0410275Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0413420Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0416063Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0417254Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0419369Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0422103Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0424696Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0426382Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 208 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0427508Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0428419Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0430730Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0433352Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:242:25: required from here 2024-08-08T22:14:18.0435652Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0437607Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0438921Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0440013Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0442417Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0444925Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:255:25: required from here 2024-08-08T22:14:18.0447221Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0449069Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0450391Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0451481Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0453045Z E0808 22:05:16.458000 46009 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-08T22:14:18.0454592Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-08T22:14:18.0455521Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-08T22:14:18.0456295Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0457028Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-08T22:14:18.0463535Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.so 2024-08-08T22:14:18.0469542Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0470281Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-08T22:14:18.0472324Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0475595Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0477662Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0478727Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0480792Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0484004Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0486192Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0487432Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0489505Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0492211Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0494756Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:208:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0496424Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 208 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0497365Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0498274Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0500610Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0503161Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:242:25: required from here 2024-08-08T22:14:18.0505454Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0507319Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0508623Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0509723Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0512217Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0514721Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:255:25: required from here 2024-08-08T22:14:18.0517564Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/qq/cqqltctg2k6szm7zjiy66pwzksyiqurvntze4jlcoyyhbwmndnyy.cpp:177:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0519448Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] 177 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0520682Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0521776Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0522921Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] . 2024-08-08T22:14:18.0523809Z E0808 22:05:18.132000 46009 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-08T22:14:18.0524833Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-08T22:14:18.0525816Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0526649Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-08T22:14:18.0533216Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.so 2024-08-08T22:14:18.0539218Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0539957Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-08T22:14:18.0541938Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0545088Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0547302Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0548371Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0550526Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0553803Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0555997Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0557183Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0559274Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0561968Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0564614Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0566297Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0567238Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0568159Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0570422Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0572971Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:198:25: required from here 2024-08-08T22:14:18.0575320Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0577190Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0578606Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0579700Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0582068Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0584670Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:211:25: required from here 2024-08-08T22:14:18.0587039Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0588907Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0590133Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0591232Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0592972Z E0808 22:05:19.745000 46009 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-08T22:14:18.0594535Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-08T22:14:18.0595452Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-08T22:14:18.0596233Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0597025Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-08T22:14:18.0603470Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.so 2024-08-08T22:14:18.0609538Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0610270Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-08T22:14:18.0612373Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0616283Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0618365Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0619437Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0621517Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0624948Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0627152Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0628339Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0630427Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0633394Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0635979Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0637747Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0638697Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0639617Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0641943Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0644504Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:198:25: required from here 2024-08-08T22:14:18.0646857Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0648743Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0650090Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0651204Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0653629Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0656190Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:211:25: required from here 2024-08-08T22:14:18.0658596Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/jz/cjz7gvxxxs6vitrmrqhfgacd6lev2x23j3yzknlpzt7v6mgmrt45.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0660541Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0661799Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0662905Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0663772Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] . 2024-08-08T22:14:18.0664583Z E0808 22:05:21.395000 46009 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-08T22:14:18.0665508Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] Exception C++ compile error 2024-08-08T22:14:18.0666354Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0667087Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] Command: 2024-08-08T22:14:18.0673842Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] g++ /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.so 2024-08-08T22:14:18.0680070Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 2024-08-08T22:14:18.0680812Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] Output: 2024-08-08T22:14:18.0682770Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0685951Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0688149Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0689233Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0691329Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0694533Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0696827Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0698007Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0700157Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0702889Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0705463Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0707127Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0708080Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0708997Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0711270Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0713939Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:198:25: required from here 2024-08-08T22:14:18.0716589Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0718464Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0719713Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0720807Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0723299Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0725796Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:211:25: required from here 2024-08-08T22:14:18.0728083Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0729941Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0731175Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0732362Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] | TORCH_CHECK 2024-08-08T22:14:18.0734012Z E0808 22:05:23.058000 46009 torch/_inductor/select_algorithm.py:1305] for benchmark choice DataProcessorChoiceCallerWrapper() 2024-08-08T22:14:18.0735580Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] Runtime error during autotuning: 2024-08-08T22:14:18.0736509Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] C++ compile error 2024-08-08T22:14:18.0737294Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0738035Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] Command: 2024-08-08T22:14:18.0744638Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] g++ /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp -D TORCH_INDUCTOR_CPP_WRAPPER -D C10_USING_CUSTOM_GENERATED_MACROS -D CPU_CAPABILITY_AVX2 -shared -fPIC -O3 -DNDEBUG -ffast-math -fno-finite-math-only -fno-unsafe-math-optimizations -ffp-contract=off -march=native -Wall -std=c++17 -Wno-unused-variable -Wno-unknown-pragmas -fopenmp -I/opt/conda/envs/py_3.10/include/python3.10 -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -I/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -mavx2 -mfma -mf16c -D_GLIBCXX_USE_CXX11_ABI=1 -ltorch -ltorch_cpu -ltorch_python -lc10 -lgomp -L/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -o /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.so 2024-08-08T22:14:18.0750834Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 2024-08-08T22:14:18.0751571Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] Output: 2024-08-08T22:14:18.0753687Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In function ‘void cpp_packed_gemm_micro_gemm_kernel(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0756851Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:19:11: warning: typedef ‘using VectorizedIn = class at::vec::CPU_CAPABILITY::Vectorized’ locally defined but not used [-Wunused-local-typedefs] 2024-08-08T22:14:18.0758926Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 19 | using VectorizedIn = at::vec::Vectorized; 2024-08-08T22:14:18.0760005Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~ 2024-08-08T22:14:18.0762155Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In function ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t)’: 2024-08-08T22:14:18.0765298Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:21: error: there are no arguments to ‘AOTI_TORCH_CHECK’ that depend on a template parameter, so a declaration of ‘AOTI_TORCH_CHECK’ must be available [-fpermissive] 2024-08-08T22:14:18.0767504Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0768684Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0770855Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:21: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) 2024-08-08T22:14:18.0773624Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In function ‘void cpp_packed_gemm(const float*, const float*, const float*, float*)’: 2024-08-08T22:14:18.0776191Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:164:5: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0777888Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 164 | AOTI_TORCH_CHECK( 2024-08-08T22:14:18.0778850Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | ^~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0779771Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0782074Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = false; int64_t = long int]’: 2024-08-08T22:14:18.0784653Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:198:25: required from here 2024-08-08T22:14:18.0787047Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0788950Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0790194Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0791300Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0793786Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp: In instantiation of ‘void cpp_packed_gemm_micro_gemm(const float*, const float*, float*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t) [with bool accum = true; int64_t = long int]’: 2024-08-08T22:14:18.0796306Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:211:25: required from here 2024-08-08T22:14:18.0798724Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] /tmp/tmp3oo5yc75/k6/ck6acc3rwtkdaef4j35x4rsk7tj7n3aih77yjuiemj6hiufy2jut.cpp:133:37: error: ‘AOTI_TORCH_CHECK’ was not declared in this scope; did you mean ‘TORCH_CHECK’? 2024-08-08T22:14:18.0800609Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] 133 | AOTI_TORCH_CHECK(false, "Unsupported block_m: ", block_m); 2024-08-08T22:14:18.0801903Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2024-08-08T22:14:18.0803008Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] | TORCH_CHECK 2024-08-08T22:14:18.0803882Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] . 2024-08-08T22:14:18.0804739Z E0808 22:05:24.690000 46009 torch/_inductor/select_algorithm.py:1508] Ignoring this choice. 2024-08-08T22:14:18.0805470Z PASSED [25.4542s] [ 72%] 2024-08-08T22:14:18.0806818Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_model_modified_weights_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.8174s] [ 74%] 2024-08-08T22:14:18.0808872Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_output_path_1_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.6109s] [ 75%] 2024-08-08T22:14:18.0811157Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_quantized_linear_non_abi_compatible_cpu [W808 22:06:11.551339482 QuantizedLinear.cpp:383] Warning: fbgemm_pack_gemm_matrix_fp16 is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-08T22:14:18.0813406Z [W808 22:06:26.504160511 QuantizedLinear.cpp:418] Warning: fbgemm_linear_fp16_weight_fp32_activation is deprecated and will be removed in a future PyTorch release. (function operator()) 2024-08-08T22:14:18.0814580Z PASSED [14.9983s] [ 77%] 2024-08-08T22:14:18.0816154Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_reuse_kernel_dynamic_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [16.7532s] [ 79%] 2024-08-08T22:14:18.0818142Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_runtime_checks_fp8_non_abi_compatible_cpu SKIPPED [0.0002s] (FP8 is only supported on H100+) [ 81%] 2024-08-08T22:14:18.0820498Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py SKIPPED [0.0032s] (requires CUDA) [ 82%] 2024-08-08T22:14:18.0822742Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_non_abi_compatible_cpu SKIPPED [0.0030s] (requires CUDA) [ 84%] 2024-08-08T22:14:18.0824821Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCpu::test_zero_size_weight_non_abi_compatible_cpu <- test/inductor/test_torchinductor.py PASSED [15.7485s] [ 86%] 2024-08-08T22:14:18.0826822Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_buffer_mutation_2_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.4441s] [ 87%] 2024-08-08T22:14:18.0828804Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_int_list_input_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.0693s] [ 89%] 2024-08-08T22:14:18.0830850Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_nested_tensor_from_jagged_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [17.2552s] [ 91%] 2024-08-08T22:14:18.0833168Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_scatter_reduce_fallback_non_abi_compatible_cuda <- test/inductor/test_torchinductor.py PASSED [16.2835s] [ 93%] 2024-08-08T22:14:18.0835156Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_equal_to_1_float_arg_dynamic_True_non_abi_compatible_cuda PASSED [16.4448s] [ 94%] 2024-08-08T22:14:18.0837096Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_non_abi_compatible_cuda PASSED [21.5168s] [ 96%] 2024-08-08T22:14:18.0839134Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_non_abi_compatible_cuda PASSED [16.7316s] [ 98%] 2024-08-08T22:14:18.0840977Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda PASSED [0.2960s] [100%] 2024-08-08T22:14:18.0841923Z 2024-08-08T22:14:18.0842847Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-301046bc9c4f8c99.xml - 2024-08-08T22:14:18.0844388Z ================== 35 passed, 23 skipped in 436.12s (0:07:16) ================== 2024-08-08T22:14:18.0845124Z Got exit code -11 (SIGSEGV) 2024-08-08T22:14:18.0845498Z Retrying single test... 2024-08-08T22:14:18.0846455Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-c1412f5ee19cf13a.xml 2024-08-08T22:14:18.0847536Z ============================= test session starts ============================== 2024-08-08T22:14:18.0848453Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:18.0849160Z cachedir: .pytest_cache 2024-08-08T22:14:18.0850036Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:18.0850896Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:18.0851306Z configfile: pytest.ini 2024-08-08T22:14:18.0852096Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:18.0853083Z collecting ... collected 912 items / 57 deselected / 855 selected 2024-08-08T22:14:18.0854391Z stepcurrent: skipping 57 already run items. Running only test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda 2024-08-08T22:14:18.0855598Z Running 1 items in this shard 2024-08-08T22:14:18.0855848Z 2024-08-08T22:14:18.0856680Z inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda PASSED [1.4465s] [100%] 2024-08-08T22:14:18.0857554Z 2024-08-08T22:14:18.0858461Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-c1412f5ee19cf13a.xml - 2024-08-08T22:14:18.0859897Z ======================= 1 passed, 57 deselected in 1.53s ======================= 2024-08-08T22:14:18.0860572Z Got exit code 0 2024-08-08T22:14:18.0861053Z Test succeeeded in new process, continuing with the rest of the tests 2024-08-08T22:14:18.0862232Z Test results will be stored in test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-85bbd9fa06eef850.xml 2024-08-08T22:14:18.0863315Z ============================= test session starts ============================== 2024-08-08T22:14:18.0864236Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.5.0 -- /opt/conda/envs/py_3.10/bin/python 2024-08-08T22:14:18.0864940Z cachedir: .pytest_cache 2024-08-08T22:14:18.0865822Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2024-08-08T22:14:18.0866674Z rootdir: /var/lib/jenkins/workspace 2024-08-08T22:14:18.0867157Z configfile: pytest.ini 2024-08-08T22:14:18.0867940Z plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, xdist-3.3.1, xdoctest-1.1.0 2024-08-08T22:14:18.0868916Z collecting ... collected 912 items / 58 deselected / 854 selected 2024-08-08T22:14:18.0869510Z stepcurrent: skipping 58 already run items. 2024-08-08T22:14:18.0869971Z Running 0 items in this shard 2024-08-08T22:14:18.0870223Z 2024-08-08T22:14:18.0871127Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-85bbd9fa06eef850.xml - 2024-08-08T22:14:18.0872684Z ============================ 58 deselected in 0.07s ============================ 2024-08-08T22:14:18.0874354Z The following tests failed and then succeeded when run in a new process['ul', 'test/inductor/test_aot_inductor.py::AOTInductorTestNonABICompatibleCuda::test_unsupported_input_dtype_non_abi_compatible_cuda'] 2024-08-08T22:14:18.0875531Z 2024-08-08T22:14:18.0876247Z FINISHED PRINTING LOG FILE of inductor/test_aot_inductor 7/16 (test/test-reports/inductor.test_aot_inductor_7.16_456039372a27472d_.log) 2024-08-08T22:14:18.0877003Z 2024-08-08T22:14:21.1156231Z Running inductor/test_compiled_optimizers 4/4 ... [2024-08-08 22:14:21.114804] 2024-08-08T22:14:21.1158211Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '-m', 'not serial', '--shard-id=4', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 22:14:21.115230] 2024-08-08T22:18:53.1327682Z 2024-08-08T22:18:53.1330116Z inductor/test_compiled_optimizers 2/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_2.4_b43bac4adf31f447_.log 2024-08-08T22:18:53.1457454Z Running 128 items in this shard: test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_recompile_default, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_t0_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_decoupled_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_ASGD_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_ASGD_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adadelta_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adagrad_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adagrad_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_AdamW_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_AdamW_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adamax_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_NAdam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_NAdam_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RMSprop_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SGD_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SparseAdam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SparseAdam_use_closure_True_cuda_float32 2024-08-08T22:18:53.1592841Z 2024-08-08T22:18:56.3252406Z Running inductor/test_cuda_cpp_wrapper 3/4 ... [2024-08-08 22:18:56.324613] 2024-08-08T22:18:56.3255969Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_cpp_wrapper.py', '-m', 'not serial', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 22:18:56.325025] 2024-08-08T22:20:29.3714457Z 2024-08-08T22:20:29.3716569Z inductor/test_compiled_optimizers 3/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_3.4_0f5b5bc62f9afca6_.log 2024-08-08T22:20:29.3843442Z Running 150 items in this shard: test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_compile_time_smoketest, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adafactor_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_LBFGS_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RMSprop_use_closure_True_cuda_float32 2024-08-08T22:20:29.3950345Z 2024-08-08T22:20:32.5798859Z Running inductor/test_cpu_cpp_wrapper 1/1 ... [2024-08-08 22:20:32.579302] 2024-08-08T22:20:32.5801595Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_cpp_wrapper.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-08-08 22:20:32.579688] 2024-08-08T22:20:39.2515007Z 2024-08-08T22:20:39.2516905Z inductor/test_cpu_cpp_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_cpp_wrapper_1.1_f23bf753e2964f7d_.log 2024-08-08T22:20:39.2517886Z 2024-08-08T22:23:23.2660122Z 2024-08-08T22:23:23.2662074Z inductor/test_compiled_optimizers 4/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_4.4_a9993a0dd11ff1df_.log 2024-08-08T22:23:23.2751775Z Running 158 items in this shard: test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_t0_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_basic_shampoo, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_closure_graph_break, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_get_value_on_static_address, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_recompile_foreach, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_recompile_single, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adam_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adamax_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RAdam_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Rprop_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_SGD_use_closure_False_cuda_float32 2024-08-08T22:23:23.2839886Z 2024-08-08T22:33:16.8944222Z 2024-08-08T22:33:16.8946312Z inductor/test_cuda_cpp_wrapper 3/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_cpp_wrapper_3.4_c69b2b2988173fe1_.log 2024-08-08T22:33:16.8968504Z Running 27 items in this shard: test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_add_complex4_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_cat_slice_cat_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_custom_op_3_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_foreach_cpp_wrapper_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_inductor_layout_optimization_input_mutations_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_layer_norm_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_pointwise_hermite_polynomial_h_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_profiler_mark_wrapper_call_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_relu_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_sum_dtype_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_sum_int_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::TestCudaWrapper::test_transpose_cuda_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_add_complex_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_bmm1_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_buffer_use_after_remove_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_cat_slice_cat_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_dtypeview_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_foreach_cpp_wrapper_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_index_tensor_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_insignificant_strides_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_linear_relu_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_mm_views_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_pointwise_hermite_polynomial_he_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_pow3_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_profiler_mark_wrapper_call_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_reduction1_cuda_dynamic_shapes_cuda_wrapper, test/inductor/test_cuda_cpp_wrapper.py::DynamicShapesCudaWrapperCudaTests::test_roi_align_cuda_dynamic_shapes_cuda_wrapper 2024-08-08T22:33:16.8991105Z 2024-08-08T22:33:18.1950545Z 2024-08-08T22:33:18.1951053Z real 90m13.623s 2024-08-08T22:33:18.1951446Z user 154m23.007s 2024-08-08T22:33:18.1951762Z sys 11m26.624s 2024-08-08T22:33:18.1954923Z + assert_git_not_dirty 2024-08-08T22:33:18.1955980Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-08T22:33:18.1956681Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *xla* ]] 2024-08-08T22:33:18.1957541Z ++ git status --porcelain 2024-08-08T22:33:18.1958537Z ++ grep -v '?? third_party' 2024-08-08T22:33:20.6043111Z ++ true 2024-08-08T22:33:20.6045630Z + git_status= 2024-08-08T22:33:20.6046250Z + [[ -n '' ]] 2024-08-08T22:33:20.6046623Z + test_libtorch 2 2024-08-08T22:33:20.6046921Z + local SHARD=2 2024-08-08T22:33:20.6047248Z + [[ default != \s\l\o\w ]] 2024-08-08T22:33:20.6047649Z + echo 'Testing libtorch' 2024-08-08T22:33:20.6047989Z Testing libtorch 2024-08-08T22:33:20.6049098Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libbackend_with_compiler.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6066971Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libjitbackend_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6080727Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libcaffe2_nvrtc.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6096584Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10d_cuda_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6110882Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libshm.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6128996Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda_linalg.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6144008Z + ln -sf '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libnvfuser*' /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6157657Z + export CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6158583Z + CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:33:20.6159210Z + [[ -z 2 ]] 2024-08-08T22:33:20.6159501Z + [[ 2 == \1 ]] 2024-08-08T22:33:20.6159822Z + [[ -z 2 ]] 2024-08-08T22:33:20.6160133Z + [[ 2 == \2 ]] 2024-08-08T22:33:20.6160438Z + test_libtorch_jit 2024-08-08T22:33:20.6160746Z + pushd test 2024-08-08T22:33:20.6161045Z ~/workspace/test ~/workspace 2024-08-08T22:33:20.6161447Z + python cpp/jit/tests_setup.py setup 2024-08-08T22:33:22.5073596Z + popd 2024-08-08T22:33:22.5073942Z ~/workspace 2024-08-08T22:33:22.5074626Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 == *cuda* ]] 2024-08-08T22:33:22.5075123Z + [[ default != *nogpu* ]] 2024-08-08T22:33:22.5075473Z + LTC_TS_CUDA=1 2024-08-08T22:33:22.5076032Z + python test/run_test.py --cpp --verbose -i cpp/test_jit cpp/test_lazy 2024-08-08T22:33:22.6062037Z /var/lib/jenkins/workspace/test/run_test.py:21: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-08-08T22:33:22.6063315Z import pkg_resources 2024-08-08T22:33:26.1256118Z Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2024-08-08T22:33:26.1259369Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2024-08-08T22:33:26.1366689Z Found test times from artifacts 2024-08-08T22:33:26.1798123Z Found test times from artifacts 2024-08-08T22:33:26.1812654Z Running 25% of tests based on TD 2024-08-08T22:33:26.1816488Z Running parallel tests on 3 processes 2024-08-08T22:33:26.1817102Z Name: tests to run (est. time: 0.0min) 2024-08-08T22:33:26.1817633Z Serial tests (0): 2024-08-08T22:33:26.1817983Z Parallel tests (1): 2024-08-08T22:33:26.1818320Z cpp/test_jit 1/1 2024-08-08T22:33:26.1818660Z Name: excluded (est. time: 0.0min) 2024-08-08T22:33:26.1821152Z Serial tests (0): 2024-08-08T22:33:26.1821479Z Parallel tests (1): 2024-08-08T22:33:26.1821805Z cpp/test_lazy 1/1 2024-08-08T22:33:26.1822481Z Starting test batch 'tests to run' 0.0 seconds after initiating testing 2024-08-08T22:33:26.1911297Z Running cpp/test_jit 1/1 ... [2024-08-08 22:33:26.190735] 2024-08-08T22:33:26.1918075Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_jit', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-4378125074b4389d.xml', '-x', '--reruns=2'] ... [2024-08-08 22:33:26.191343] 2024-08-08T22:33:28.8624988Z 2024-08-08T22:33:28.8627598Z cpp/test_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_jit_1.1_c290a18a0a023e8e_.log 2024-08-08T22:33:28.8628421Z 2024-08-08T22:33:28.8630837Z Running cpp/test_jit 1/1 ... [2024-08-08 22:33:28.862739] 2024-08-08T22:33:28.8637785Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_jit', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-3f438bc4418e0618.xml', '-x', '--reruns=2'] ... [2024-08-08 22:33:28.863351] 2024-08-08T22:35:38.9514743Z 2024-08-08T22:35:38.9516340Z cpp/test_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_jit_1.1_d1ef5241d4903eda_.log 2024-08-08T22:35:38.9524819Z 2024-08-08T22:35:40.0740799Z + pushd test 2024-08-08T22:35:40.0741720Z ~/workspace/test ~/workspace 2024-08-08T22:35:40.0742724Z + python cpp/jit/tests_setup.py shutdown 2024-08-08T22:35:41.7442478Z + popd 2024-08-08T22:35:41.7442814Z ~/workspace 2024-08-08T22:35:41.7444594Z + assert_git_not_dirty 2024-08-08T22:35:41.7445307Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-08T22:35:41.7446313Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *xla* ]] 2024-08-08T22:35:41.7448986Z ++ git status --porcelain 2024-08-08T22:35:41.7450717Z ++ grep -v '?? third_party' 2024-08-08T22:35:41.9840007Z ++ true 2024-08-08T22:35:41.9841966Z + git_status= 2024-08-08T22:35:41.9842607Z + [[ -n '' ]] 2024-08-08T22:35:41.9842915Z + test_aot_compilation 2024-08-08T22:35:41.9843508Z + echo 'Testing Ahead of Time compilation' 2024-08-08T22:35:41.9843964Z Testing Ahead of Time compilation 2024-08-08T22:35:41.9846060Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libc10d_cuda_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:35:41.9865107Z + ln -sf /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_cuda_linalg.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_global_deps.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorch_python.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib/libtorchbind_test.so /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:35:41.9881657Z + '[' -f /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc ']' 2024-08-08T22:35:41.9882617Z + CPP_TESTS_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:35:41.9883432Z + python test/run_test.py --cpp --verbose -i cpp/test_mobile_nnc 2024-08-08T22:35:42.0853810Z /var/lib/jenkins/workspace/test/run_test.py:21: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html 2024-08-08T22:35:42.0854886Z import pkg_resources 2024-08-08T22:35:45.5645092Z Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2024-08-08T22:35:45.5646720Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2024-08-08T22:35:45.5753913Z Found test times from artifacts 2024-08-08T22:35:45.6185488Z Found test times from artifacts 2024-08-08T22:35:45.6199829Z Running 25% of tests based on TD 2024-08-08T22:35:45.6203061Z Running parallel tests on 3 processes 2024-08-08T22:35:45.6203607Z Name: tests to run (est. time: 0.0min) 2024-08-08T22:35:45.6204151Z Serial tests (0): 2024-08-08T22:35:45.6204548Z Parallel tests (1): 2024-08-08T22:35:45.6205209Z cpp/test_mobile_nnc 1/1 2024-08-08T22:35:45.6207748Z Name: excluded (est. time: 0.0min) 2024-08-08T22:35:45.6208303Z Serial tests (0): 2024-08-08T22:35:45.6208722Z Parallel tests (0): 2024-08-08T22:35:45.6209360Z Starting test batch 'tests to run' 0.0 seconds after initiating testing 2024-08-08T22:35:45.6292112Z Running cpp/test_mobile_nnc 1/1 ... [2024-08-08 22:35:45.628862] 2024-08-08T22:35:45.6298048Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc', '-m', 'serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-2c573435e93ec042.xml', '-x', '--reruns=2'] ... [2024-08-08 22:35:45.629377] 2024-08-08T22:35:47.7990494Z 2024-08-08T22:35:47.7992013Z cpp/test_mobile_nnc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_mobile_nnc_1.1_fc078f2c7b1cc7f8_.log 2024-08-08T22:35:47.7993207Z 2024-08-08T22:35:48.1895179Z Running cpp/test_mobile_nnc 1/1 ... [2024-08-08 22:35:48.188872] 2024-08-08T22:35:48.1899550Z Executing ['pytest', '/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/test_mobile_nnc', '-m', 'not serial', '-v', '-vv', '-rfEX', '-n', '3', '--junit-xml-reruns', 'test-reports/python-pytest/test.run_test/test.run_test-ea7adcca63016e48.xml', '-x', '--reruns=2'] ... [2024-08-08 22:35:48.189458] 2024-08-08T22:35:51.9129535Z 2024-08-08T22:35:51.9131695Z cpp/test_mobile_nnc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp.test_mobile_nnc_1.1_cd31fac5c9fbda4a_.log 2024-08-08T22:35:51.9132640Z 2024-08-08T22:35:53.0215606Z + '[' -f /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin/aot_model_compiler_test ']' 2024-08-08T22:35:53.0216416Z + source test/mobile/nnc/test_aot_compile.sh 2024-08-08T22:35:53.0216911Z ++ set -e -o pipefail 2024-08-08T22:35:53.0219225Z +++ python -c 'import site; print(site.getsitepackages()[0])' 2024-08-08T22:35:53.0387693Z ++ TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2024-08-08T22:35:53.0389116Z ++ TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2024-08-08T22:35:53.0391983Z +++ dirname test/mobile/nnc/test_aot_compile.sh 2024-08-08T22:35:53.0407224Z ++ CURRENT_DIR=test/mobile/nnc 2024-08-08T22:35:53.0407663Z ++ MODEL=aot_test_model.pt 2024-08-08T22:35:53.0408117Z ++ COMPILED_MODEL=aot_test_model.compiled.pt 2024-08-08T22:35:53.0408595Z ++ COMPILED_CODE=aot_test_model.compiled.ll 2024-08-08T22:35:53.0411893Z +++ mktemp -d -t build_XXX 2024-08-08T22:35:53.0429683Z ++ TMP_DIR=/tmp/build_fwg 2024-08-08T22:35:53.0430706Z + test_custom_script_ops 2024-08-08T22:35:53.0431347Z + echo 'Testing custom script operators' 2024-08-08T22:35:53.0432016Z Testing custom script operators 2024-08-08T22:35:53.0432747Z + CUSTOM_OP_BUILD=/var/lib/jenkins/workspace/build/custom_test_artifacts/custom-op-build 2024-08-08T22:35:53.0433428Z + pushd test/custom_operator 2024-08-08T22:35:53.0433846Z ~/workspace/test/custom_operator ~/workspace 2024-08-08T22:35:53.0434609Z + cp -a /var/lib/jenkins/workspace/build/custom_test_artifacts/custom-op-build build 2024-08-08T22:35:53.0612567Z + python test_custom_ops.py -v 2024-08-08T22:35:55.4055867Z /var/lib/jenkins/workspace/test/custom_operator/pointwise.py:11: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-08T22:35:55.4058863Z @torch.library.impl_abstract("custom::cos") 2024-08-08T22:35:55.4244564Z /var/lib/jenkins/workspace/test/custom_operator/pointwise.py:17: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-08T22:35:55.4247693Z @torch.library.impl_abstract("custom::tan") 2024-08-08T22:35:55.4275624Z Test results will be stored in test-reports/python-unittest/test_custom_ops 2024-08-08T22:35:55.4292813Z 2024-08-08T22:35:55.4293466Z Running tests... 2024-08-08T22:35:55.4293965Z ---------------------------------------------------------------------- 2024-08-08T22:35:55.4923444Z test_abstract_impl_pystub_faketensor (__main__.TestCustomOperators) ... /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops.py:9: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-08T22:35:55.4925331Z @torch.library.impl_abstract("custom::nonzero") 2024-08-08T22:35:55.5017043Z /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops.py:13: FutureWarning: `create_unbacked_symint` is deprecated, please use `new_dynamic_size` instead 2024-08-08T22:35:55.5018955Z nnz = ctx.create_unbacked_symint() 2024-08-08T22:35:55.5100477Z ok (0.080s) 2024-08-08T22:35:55.5118336Z test_abstract_impl_pystub_meta (__main__.TestCustomOperators) ... /var/lib/jenkins/workspace/test/custom_operator/my_custom_ops2.py:9: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch. 2024-08-08T22:35:55.5120304Z @torch.library.impl_abstract("custom::sin") 2024-08-08T22:35:55.5142988Z ok (0.004s) 2024-08-08T22:35:55.5167325Z test_calling_custom_op (__main__.TestCustomOperators) ... ok (0.002s) 2024-08-08T22:35:55.5634777Z test_calling_custom_op_inside_script_module (__main__.TestCustomOperators) ... ok (0.047s) 2024-08-08T22:35:55.5642011Z test_calling_custom_op_string (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-08T22:35:55.5677533Z test_calling_custom_op_with_autograd (__main__.TestCustomOperators) ... /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:812: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1203.) 2024-08-08T22:35:55.5681156Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2024-08-08T22:35:55.5690423Z ok (0.005s) 2024-08-08T22:35:55.5698598Z test_calling_custom_op_with_autograd_in_nograd_mode (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-08T22:35:55.5702319Z test_custom_library_is_loaded (__main__.TestCustomOperators) ... ok (0.000s) 2024-08-08T22:35:55.7751255Z test_dynamo_pystub_suggestion (__main__.TestCustomOperators) ... ok (0.205s) 2024-08-08T22:35:55.9239156Z test_op_with_incorrect_abstract_impl_pystub (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-08T22:35:55.9247049Z test_op_with_no_abstract_impl_pystub (__main__.TestCustomOperators) ... ok (0.001s) 2024-08-08T22:35:55.9338016Z test_saving_and_loading_script_module_with_custom_op (__main__.TestCustomOperators) ... ok (0.009s) 2024-08-08T22:35:55.9338918Z 2024-08-08T22:35:55.9339560Z ---------------------------------------------------------------------- 2024-08-08T22:35:55.9340040Z Ran 12 tests in 0.505s 2024-08-08T22:35:55.9340273Z 2024-08-08T22:35:55.9340379Z OK 2024-08-08T22:35:55.9340529Z 2024-08-08T22:35:55.9340661Z Generating XML reports... 2024-08-08T22:35:55.9388335Z Generated XML report: test-reports/python-unittest/test_custom_ops/TEST-TestCustomOperators-20240808223555.xml 2024-08-08T22:35:56.5485389Z + python model.py --export-script-module=model.pt 2024-08-08T22:35:58.2102432Z + build/test_custom_ops ./model.pt 2024-08-08T22:35:58.6852365Z [W808 22:35:58.653025435 engine.cpp:1203] Warning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (function operator()) 2024-08-08T22:35:59.0178406Z ok 2024-08-08T22:35:59.2278064Z + popd 2024-08-08T22:35:59.2278419Z ~/workspace 2024-08-08T22:35:59.2278728Z + assert_git_not_dirty 2024-08-08T22:35:59.2279447Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-08T22:35:59.2280098Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *xla* ]] 2024-08-08T22:35:59.2286420Z ++ git status --porcelain 2024-08-08T22:35:59.2286842Z ++ grep -v '?? third_party' 2024-08-08T22:35:59.4689341Z ++ true 2024-08-08T22:35:59.4691134Z + git_status= 2024-08-08T22:35:59.4691677Z + [[ -n '' ]] 2024-08-08T22:35:59.4691978Z + test_custom_backend 2024-08-08T22:35:59.4692676Z + echo 'Testing custom backends' 2024-08-08T22:35:59.4693069Z Testing custom backends 2024-08-08T22:35:59.4693840Z + CUSTOM_BACKEND_BUILD=/var/lib/jenkins/workspace/build/custom_test_artifacts/custom-backend-build 2024-08-08T22:35:59.4694555Z + pushd test/custom_backend 2024-08-08T22:35:59.4694962Z ~/workspace/test/custom_backend ~/workspace 2024-08-08T22:35:59.4695858Z + cp -a /var/lib/jenkins/workspace/build/custom_test_artifacts/custom-backend-build build 2024-08-08T22:35:59.4854607Z + python test_custom_backend.py -v 2024-08-08T22:36:01.7988585Z Test results will be stored in test-reports/python-unittest/test_custom_backend 2024-08-08T22:36:01.7999922Z 2024-08-08T22:36:01.8000260Z Running tests... 2024-08-08T22:36:01.8000750Z ---------------------------------------------------------------------- 2024-08-08T22:36:01.8004944Z test_execute (__main__.TestCustomBackend) 2024-08-08T22:36:01.8540482Z Test execution using the custom backend. ... ok (0.054s) 2024-08-08T22:36:01.8543839Z test_save_load (__main__.TestCustomBackend) 2024-08-08T22:36:01.8737525Z Test that a lowered module can be executed correctly ... ok (0.020s) 2024-08-08T22:36:01.8738431Z 2024-08-08T22:36:01.8739035Z ---------------------------------------------------------------------- 2024-08-08T22:36:01.8739977Z Ran 2 tests in 0.074s 2024-08-08T22:36:01.8740403Z 2024-08-08T22:36:01.8740565Z OK 2024-08-08T22:36:01.8740706Z 2024-08-08T22:36:01.8740850Z Generating XML reports... 2024-08-08T22:36:01.8763684Z Generated XML report: test-reports/python-unittest/test_custom_backend/TEST-TestCustomBackend-20240808223601.xml 2024-08-08T22:36:02.3987582Z + python backend.py --export-module-to=model.pt 2024-08-08T22:36:04.1108876Z + build/test_custom_backend ./model.pt 2024-08-08T22:36:04.6056520Z Testing custom_backend 2024-08-08T22:36:04.6736723Z OK 2024-08-08T22:36:04.7951239Z + rm -f ./model.pt 2024-08-08T22:36:04.7967888Z + popd 2024-08-08T22:36:04.7968205Z ~/workspace 2024-08-08T22:36:04.7968580Z + assert_git_not_dirty 2024-08-08T22:36:04.7969300Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-08T22:36:04.7970091Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *xla* ]] 2024-08-08T22:36:04.7976982Z ++ git status --porcelain 2024-08-08T22:36:04.7977708Z ++ grep -v '?? third_party' 2024-08-08T22:36:05.0407425Z ++ true 2024-08-08T22:36:05.0409846Z + git_status= 2024-08-08T22:36:05.0410531Z + [[ -n '' ]] 2024-08-08T22:36:05.0410890Z + test_torch_function_benchmark 2024-08-08T22:36:05.0411390Z + echo 'Testing __torch_function__ benchmarks' 2024-08-08T22:36:05.0411861Z Testing __torch_function__ benchmarks 2024-08-08T22:36:05.0412299Z + pushd benchmarks/overrides_benchmark 2024-08-08T22:36:05.0412819Z ~/workspace/benchmarks/overrides_benchmark ~/workspace 2024-08-08T22:36:05.0413364Z + python bench.py -n 1 -m 2 2024-08-08T22:36:06.4176432Z Type tensor had a minimum time of 0.0054836273193359375 us and a standard deviation of 0.35841678618453443 us. 2024-08-08T22:36:06.4177898Z Type SubTensor had a minimum time of 0.013113021850585938 us and a standard deviation of 0.03759498940780759 us. 2024-08-08T22:36:06.4179497Z Type WithTorchFunction had a minimum time of 0.0064373016357421875 us and a standard deviation of 0.0215791860682657 us. 2024-08-08T22:36:06.4180698Z Type SubWithTorchFunction had a minimum time of 0.0209808349609375 us and a standard deviation of 0.004214684850012418 us. 2024-08-08T22:36:06.7030901Z + python pyspybench.py Tensor -n 1 2024-08-08T22:36:08.3634719Z + python pyspybench.py SubTensor -n 1 2024-08-08T22:36:10.0338447Z + python pyspybench.py WithTorchFunction -n 1 2024-08-08T22:36:11.7004457Z + python pyspybench.py SubWithTorchFunction -n 1 2024-08-08T22:36:13.3517635Z + popd 2024-08-08T22:36:13.3518005Z ~/workspace 2024-08-08T22:36:13.3518308Z + assert_git_not_dirty 2024-08-08T22:36:13.3518995Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *rocm* ]] 2024-08-08T22:36:13.3519638Z + [[ linux-focal-cuda12.4-py3.10-gcc9-sm86 != *xla* ]] 2024-08-08T22:36:13.3525113Z ++ git status --porcelain 2024-08-08T22:36:13.3526641Z ++ grep -v '?? third_party' 2024-08-08T22:36:13.5926092Z ++ true 2024-08-08T22:36:13.5928197Z + git_status= 2024-08-08T22:36:13.5929128Z + [[ -n '' ]] 2024-08-08T22:36:13.5929589Z + cleanup_workspace 2024-08-08T22:36:13.5930502Z + echo 'sudo may print the following warning message that can be ignored. The chown command will still run.' 2024-08-08T22:36:13.5931788Z sudo may print the following warning message that can be ignored. The chown command will still run. 2024-08-08T22:36:13.5932739Z + echo ' sudo: setrlimit(RLIMIT_STACK): Operation not permitted' 2024-08-08T22:36:13.5933382Z sudo: setrlimit(RLIMIT_STACK): Operation not permitted 2024-08-08T22:36:13.5934196Z + echo 'For more details refer to https://github.com/sudo-project/sudo/issues/42' 2024-08-08T22:36:13.5935073Z For more details refer to https://github.com/sudo-project/sudo/issues/42 2024-08-08T22:36:13.5935779Z + sudo chown -R 1000 /var/lib/jenkins/workspace 2024-08-08T22:36:14.3365655Z ##[group]Run cat test/**/*_toprint.log || true 2024-08-08T22:36:14.3366179Z cat test/**/*_toprint.log || true 2024-08-08T22:36:14.3380362Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:14.3380861Z env: 2024-08-08T22:36:14.3381136Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.3381586Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.3382443Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.3383077Z ##[endgroup] 2024-08-08T22:36:14.3479605Z cat: 'test/**/*_toprint.log': No such file or directory 2024-08-08T22:36:14.3511449Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2024-08-08T22:36:14.3512177Z kill "$MONITOR_SCRIPT_PID" 2024-08-08T22:36:14.3521740Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:14.3522237Z env: 2024-08-08T22:36:14.3522513Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.3522960Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.3523681Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.3524330Z MONITOR_SCRIPT_PID: 250122 2024-08-08T22:36:14.3524687Z ##[endgroup] 2024-08-08T22:36:14.3720633Z Prepare all required actions 2024-08-08T22:36:14.3721079Z Getting action download info 2024-08-08T22:36:14.5885229Z Download action repository 'actions/upload-artifact@v3' (SHA:a8a3f3ad30e3422c9c7b888a15615d19a852ae32) 2024-08-08T22:36:14.7542883Z ##[group]Run ./.github/actions/upload-test-artifacts 2024-08-08T22:36:14.7543340Z with: 2024-08-08T22:36:14.7543862Z file-suffix: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350 2024-08-08T22:36:14.7544513Z s3-bucket: gha-artifacts 2024-08-08T22:36:14.7544841Z env: 2024-08-08T22:36:14.7545116Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.7545559Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.7546268Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.7546889Z ##[endgroup] 2024-08-08T22:36:14.7572897Z ##[group]Run # Remove any previous test jsons if they exist 2024-08-08T22:36:14.7573533Z # Remove any previous test jsons if they exist 2024-08-08T22:36:14.7574031Z rm -f test-jsons-*.zip 2024-08-08T22:36:14.7574558Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2024-08-08T22:36:14.7584366Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:14.7584902Z env: 2024-08-08T22:36:14.7585183Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.7585632Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.7586344Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.7587214Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350 2024-08-08T22:36:14.7587832Z ##[endgroup] 2024-08-08T22:36:14.7874662Z adding: test/allowlist_for_publicAPI.json (deflated 79%) 2024-08-08T22:36:14.7904132Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2024-08-08T22:36:14.7904993Z adding: test/minioptest_failures_dict.json (deflated 70%) 2024-08-08T22:36:14.7910971Z adding: test/profiler/profiler_utils_mock_events.json (deflated 87%) 2024-08-08T22:36:14.7914539Z adding: test/test-reports/td_exclusions-b3925aafbef5bcdb51f7.json (deflated 81%) 2024-08-08T22:36:14.7915751Z adding: test/test-reports/td_exclusions-38eb40ca0adb27e4e111.json (deflated 38%) 2024-08-08T22:36:14.7916692Z adding: test/test-reports/td_exclusions-b97db7820b471cec34b5.json (deflated 18%) 2024-08-08T22:36:14.7917479Z adding: test/.pytorch-slow-tests.json (deflated 67%) 2024-08-08T22:36:14.7929126Z adding: test/.pytorch-disabled-tests.json (deflated 89%) 2024-08-08T22:36:14.7961318Z ##[group]Run # Remove any previous test reports if they exist 2024-08-08T22:36:14.7961964Z # Remove any previous test reports if they exist 2024-08-08T22:36:14.7962478Z rm -f test-reports-*.zip 2024-08-08T22:36:14.7963060Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' -i '*.csv' 2024-08-08T22:36:14.7971992Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:14.7972498Z env: 2024-08-08T22:36:14.7972779Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.7973229Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.7973946Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.7974853Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350 2024-08-08T22:36:14.7975474Z ##[endgroup] 2024-08-08T22:36:14.8253126Z adding: test/custom_backend/test-reports/python-unittest/test_custom_backend/TEST-TestCustomBackend-20240808223601.xml (deflated 56%) 2024-08-08T22:36:14.8254855Z adding: test/custom_operator/test-reports/python-unittest/test_custom_ops/TEST-TestCustomOperators-20240808223555.xml (deflated 73%) 2024-08-08T22:36:14.8256238Z adding: test/test-reports/python-pytest/test_nestedtensor/test_nestedtensor-fd594a0b518044ae.xml (deflated 29%) 2024-08-08T22:36:14.8280977Z adding: test/test-reports/python-pytest/test_nestedtensor/test_nestedtensor-e492dd563caa7c17.xml (deflated 96%) 2024-08-08T22:36:14.8282177Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-7f26cf570014a38b.xml (deflated 28%) 2024-08-08T22:36:14.8283296Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-53d8b8e4742b0aa6.xml (deflated 28%) 2024-08-08T22:36:14.8284404Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-e77f26ae3368aa95.xml (deflated 28%) 2024-08-08T22:36:14.8292740Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-dd9d012de50495e6.xml (deflated 92%) 2024-08-08T22:36:14.8303281Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-f2d38ee8e2830ebf.xml (deflated 92%) 2024-08-08T22:36:14.8314203Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-15428101cb699235.xml (deflated 92%) 2024-08-08T22:36:14.8325871Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-2330f5b1bb70a582.xml (deflated 92%) 2024-08-08T22:36:14.8327006Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-291178fec05591b3.xml (deflated 28%) 2024-08-08T22:36:14.8328381Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-8a1870a79496e63e.xml (deflated 28%) 2024-08-08T22:36:14.8329966Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-f060d616d13d0293.xml (deflated 28%) 2024-08-08T22:36:14.8331557Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-6aef962aa03789c4.xml (deflated 28%) 2024-08-08T22:36:14.8333320Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-23cb3f1195577192.xml (deflated 93%) 2024-08-08T22:36:14.8337831Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-5e274951a1a51cae.xml (deflated 93%) 2024-08-08T22:36:14.8343374Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-9b4fc5842fed16bc.xml (deflated 93%) 2024-08-08T22:36:14.8344864Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-19c5a19a1ce16cb2.xml (deflated 28%) 2024-08-08T22:36:14.8380275Z adding: test/test-reports/python-pytest/functorch.test_ops/functorch.test_ops-0c8b282393ada1b5.xml (deflated 93%) 2024-08-08T22:36:14.8381656Z adding: test/test-reports/python-pytest/functorch.test_aotdispatch/functorch.test_aotdispatch-1cd953e00430667b.xml (deflated 28%) 2024-08-08T22:36:14.8396132Z adding: test/test-reports/python-pytest/functorch.test_aotdispatch/functorch.test_aotdispatch-5796e4e49b2f5fb0.xml (deflated 94%) 2024-08-08T22:36:14.8397590Z adding: test/test-reports/python-pytest/inductor.test_cudacodecache/inductor.test_cudacodecache-17e42fd46de89204.xml (deflated 29%) 2024-08-08T22:36:14.8399052Z adding: test/test-reports/python-pytest/inductor.test_cudacodecache/inductor.test_cudacodecache-3fc92dd20beb2933.xml (deflated 57%) 2024-08-08T22:36:14.8400525Z adding: test/test-reports/python-pytest/export.test_retraceability/export.test_retraceability-a3d5d9860697fd20.xml (deflated 28%) 2024-08-08T22:36:14.8407055Z adding: test/test-reports/python-pytest/export.test_retraceability/export.test_retraceability-2a25e2307290878f.xml (deflated 89%) 2024-08-08T22:36:14.8408465Z adding: test/test-reports/python-pytest/test_functionalization/test_functionalization-1c9ced085c948f70.xml (deflated 28%) 2024-08-08T22:36:14.8411426Z adding: test/test-reports/python-pytest/test_functionalization/test_functionalization-33b0571dadb3473a.xml (deflated 89%) 2024-08-08T22:36:14.8412637Z adding: test/test-reports/python-pytest/test_ops/test_ops-577f499e42f9c1c6.xml (deflated 28%) 2024-08-08T22:36:14.8413702Z adding: test/test-reports/python-pytest/test_ops/test_ops-94bb10f3ec113de2.xml (deflated 28%) 2024-08-08T22:36:14.8414758Z adding: test/test-reports/python-pytest/test_ops/test_ops-99425cedd22b28f1.xml (deflated 28%) 2024-08-08T22:36:14.8541351Z adding: test/test-reports/python-pytest/test_ops/test_ops-1e601ed5a626fb34.xml (deflated 97%) 2024-08-08T22:36:14.8656845Z adding: test/test-reports/python-pytest/test_ops/test_ops-b0e1630e7f66c768.xml (deflated 97%) 2024-08-08T22:36:14.8772556Z adding: test/test-reports/python-pytest/test_ops/test_ops-4ac194c87e944e9a.xml (deflated 97%) 2024-08-08T22:36:14.8773819Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-6a939d62ef14f3d2.xml (deflated 28%) 2024-08-08T22:36:14.8775313Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-18c03c8d4be1dc12.xml (deflated 28%) 2024-08-08T22:36:14.8776746Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-6d792feca55cc9a3.xml (deflated 28%) 2024-08-08T22:36:14.8778198Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-9ee3e7e9abe6a3c3.xml (deflated 89%) 2024-08-08T22:36:14.8779634Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-ef9481da91af52d2.xml (deflated 88%) 2024-08-08T22:36:14.8781055Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-1b9316ebd13cce28.xml (deflated 91%) 2024-08-08T22:36:14.8783423Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-036e5e2aa2951390.xml (deflated 89%) 2024-08-08T22:36:14.8784856Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-54badce2f3d68f90.xml (deflated 37%) 2024-08-08T22:36:14.8786288Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-f60bbe16c1ada337.xml (deflated 91%) 2024-08-08T22:36:14.8787722Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-fc78df539eb126f2.xml (deflated 29%) 2024-08-08T22:36:14.8789295Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-301046bc9c4f8c99.xml (deflated 89%) 2024-08-08T22:36:14.8790719Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-29f6d08ef7270379.xml (deflated 45%) 2024-08-08T22:36:14.8792236Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-591d916419a4856e.xml (deflated 28%) 2024-08-08T22:36:14.8793653Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-c1412f5ee19cf13a.xml (deflated 36%) 2024-08-08T22:36:14.8795084Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-85bbd9fa06eef850.xml (deflated 28%) 2024-08-08T22:36:14.8796593Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-3a00a147339babba.xml (deflated 28%) 2024-08-08T22:36:14.8798179Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-048418f9311d77a9.xml (deflated 29%) 2024-08-08T22:36:14.8799740Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-f95a0a3b55bc9b8d.xml (deflated 28%) 2024-08-08T22:36:14.8801322Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-65eab73b5b96b2dd.xml (deflated 94%) 2024-08-08T22:36:14.8803841Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-c79c307479ec167c.xml (deflated 95%) 2024-08-08T22:36:14.8810454Z adding: test/test-reports/python-pytest/inductor.test_compiled_optimizers/inductor.test_compiled_optimizers-53a7fdb6a79ed8e2.xml (deflated 95%) 2024-08-08T22:36:14.8812009Z adding: test/test-reports/python-pytest/inductor.test_cuda_cpp_wrapper/inductor.test_cuda_cpp_wrapper-6b00fdd658e09851.xml (deflated 28%) 2024-08-08T22:36:14.8813718Z adding: test/test-reports/python-pytest/inductor.test_cuda_cpp_wrapper/inductor.test_cuda_cpp_wrapper-5ca0ebfdf29ce6e3.xml (deflated 85%) 2024-08-08T22:36:14.8815060Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-4378125074b4389d.xml (deflated 29%) 2024-08-08T22:36:14.8821842Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-3f438bc4418e0618.xml (deflated 86%) 2024-08-08T22:36:14.8822995Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-2c573435e93ec042.xml (deflated 29%) 2024-08-08T22:36:14.8824162Z adding: test/test-reports/python-pytest/test.run_test/test.run_test-ea7adcca63016e48.xml (deflated 58%) 2024-08-08T22:36:14.8856616Z ##[group]Run # Remove any previous usage logs if they exist 2024-08-08T22:36:14.8857231Z # Remove any previous usage logs if they exist 2024-08-08T22:36:14.8857717Z rm -f logs-*.zip 2024-08-08T22:36:14.8858377Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2024-08-08T22:36:14.8859153Z # so check to see if the file exists first 2024-08-08T22:36:14.8859642Z if [ -f 'usage_log.txt' ]; then 2024-08-08T22:36:14.8860168Z  zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' 2024-08-08T22:36:14.8860649Z fi 2024-08-08T22:36:14.8860998Z if ls test/**/*.log 1> /dev/null 2>&1; then 2024-08-08T22:36:14.8861563Z  zip -r "logs-${FILE_SUFFIX}.zip" test -i '*.log' 2024-08-08T22:36:14.8862048Z fi 2024-08-08T22:36:14.8871132Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:14.8871757Z env: 2024-08-08T22:36:14.8872043Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.8872488Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.8873211Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.8874184Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350 2024-08-08T22:36:14.8874810Z ##[endgroup] 2024-08-08T22:36:14.8973585Z adding: usage_log.txt (deflated 92%) 2024-08-08T22:36:14.9280880Z adding: test/test-reports/test_nestedtensor_1.1_f8a9837d23968f84_.log (deflated 50%) 2024-08-08T22:36:14.9281847Z adding: test/test-reports/test_decomp_4.13_fc04e15fd9210c6b_.log (deflated 48%) 2024-08-08T22:36:14.9282743Z adding: test/test-reports/test_decomp_6.13_933edb735d04209e_.log (deflated 48%) 2024-08-08T22:36:14.9283631Z adding: test/test-reports/test_decomp_7.13_2a3e4902a82cefef_.log (deflated 48%) 2024-08-08T22:36:14.9284688Z adding: test/test-reports/inductor.test_torchinductor_opinfo_1.12_bf7ae5ad0edfb927_.log (deflated 52%) 2024-08-08T22:36:14.9285833Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.12_7e34231148571d6c_.log (deflated 52%) 2024-08-08T22:36:14.9286942Z adding: test/test-reports/inductor.test_torchinductor_opinfo_9.12_1b32c87723b83937_.log (deflated 52%) 2024-08-08T22:36:14.9287993Z adding: test/test-reports/functorch.test_ops_4.6_3d5e5fa8cb52aace_.log (deflated 50%) 2024-08-08T22:36:14.9289020Z adding: test/test-reports/functorch.test_aotdispatch_1.1_aea2f41b0ae17148_.log (deflated 52%) 2024-08-08T22:36:14.9290081Z adding: test/test-reports/inductor.test_cudacodecache_1.1_0a92d31f2f536eb8_.log (deflated 51%) 2024-08-08T22:36:14.9291122Z adding: test/test-reports/export.test_retraceability_1.1_5d8ce0917ca3e98e_.log (deflated 59%) 2024-08-08T22:36:14.9292139Z adding: test/test-reports/test_functionalization_1.1_f3c5c5183a648b1d_.log (deflated 50%) 2024-08-08T22:36:14.9293221Z adding: test/test-reports/test_ops_1.8_efc9e26a3bf1c507_.log (deflated 49%) 2024-08-08T22:36:14.9294062Z adding: test/test-reports/test_ops_6.8_30418cf82e4b3793_.log (deflated 49%) 2024-08-08T22:36:14.9294933Z adding: test/test-reports/test_ops_7.8_e7d2144b325fb767_.log (deflated 49%) 2024-08-08T22:36:14.9295914Z adding: test/test-reports/inductor.test_aot_inductor_3.16_fe15845382a81ebe_.log (deflated 51%) 2024-08-08T22:36:14.9297900Z adding: test/test-reports/inductor.test_aot_inductor_7.16_b87b9e27d29daf20_.log (deflated 51%) 2024-08-08T22:36:14.9298977Z adding: test/test-reports/inductor.test_aot_inductor_9.16_d9a00153b99de1d3_.log (deflated 51%) 2024-08-08T22:36:14.9300069Z adding: test/test-reports/inductor.test_compiled_optimizers_2.4_f18894bedfde7652_.log (deflated 52%) 2024-08-08T22:36:14.9301202Z adding: test/test-reports/inductor.test_compiled_optimizers_3.4_07003ecdaacb8e39_.log (deflated 51%) 2024-08-08T22:36:14.9302303Z adding: test/test-reports/inductor.test_compiled_optimizers_4.4_ef323a71a4a2fb79_.log (deflated 51%) 2024-08-08T22:36:14.9303410Z adding: test/test-reports/inductor.test_cuda_cpp_wrapper_3.4_452a71924e6bd520_.log (deflated 51%) 2024-08-08T22:36:14.9304463Z adding: test/test-reports/inductor.test_cpu_cpp_wrapper_1.1_78914abce923ead5_.log (stored 0%) 2024-08-08T22:36:14.9330378Z adding: test/test-reports/test_nestedtensor_1.1_343f16bee441ba94_.log (deflated 95%) 2024-08-08T22:36:14.9347169Z adding: test/test-reports/test_decomp_7.13_9de2549ae9edc5c0_.log (deflated 90%) 2024-08-08T22:36:14.9364436Z adding: test/test-reports/test_decomp_4.13_e03c060950310a19_.log (deflated 90%) 2024-08-08T22:36:14.9393454Z adding: test/test-reports/test_decomp_6.13_e203bdf501237379_.log (deflated 90%) 2024-08-08T22:36:14.9403416Z adding: test/test-reports/inductor.test_torchinductor_opinfo_1.12_17d5a0bf28fd4858_.log (deflated 91%) 2024-08-08T22:36:14.9411070Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.12_44646b0001d84854_.log (deflated 91%) 2024-08-08T22:36:14.9422952Z adding: test/test-reports/functorch.test_aotdispatch_1.1_d399b37cd9957e26_.log (deflated 92%) 2024-08-08T22:36:14.9424020Z adding: test/test-reports/inductor.test_cudacodecache_1.1_dbc14e90da05020b_.log (deflated 60%) 2024-08-08T22:36:14.9447857Z adding: test/test-reports/export.test_retraceability_1.1_b8c2771bf028c6ff_.log (deflated 92%) 2024-08-08T22:36:14.9450490Z adding: test/test-reports/test_functionalization_1.1_af9b31041f945614_.log (deflated 89%) 2024-08-08T22:36:14.9458989Z adding: test/test-reports/inductor.test_torchinductor_opinfo_9.12_bc50289be67487d1_.log (deflated 91%) 2024-08-08T22:36:14.9503603Z adding: test/test-reports/functorch.test_ops_4.6_c3430c356a52daf4_.log (deflated 92%) 2024-08-08T22:36:14.9606171Z adding: test/test-reports/test_ops_1.8_4e641004e02454ef_.log (deflated 92%) 2024-08-08T22:36:14.9705161Z adding: test/test-reports/test_ops_6.8_6aa93b7ba880065c_.log (deflated 92%) 2024-08-08T22:36:14.9805691Z adding: test/test-reports/test_ops_7.8_285c7eb3b89094cc_.log (deflated 92%) 2024-08-08T22:36:14.9809330Z adding: test/test-reports/inductor.test_aot_inductor_3.16_0d0a4d0bf708ed2b_.log (deflated 92%) 2024-08-08T22:36:14.9815274Z adding: test/test-reports/inductor.test_aot_inductor_9.16_f828ddeda7877015_.log (deflated 91%) 2024-08-08T22:36:14.9828693Z adding: test/test-reports/inductor.test_aot_inductor_7.16_456039372a27472d_.log (deflated 92%) 2024-08-08T22:36:14.9833247Z adding: test/test-reports/inductor.test_compiled_optimizers_2.4_b43bac4adf31f447_.log (deflated 91%) 2024-08-08T22:36:14.9838492Z adding: test/test-reports/inductor.test_compiled_optimizers_3.4_0f5b5bc62f9afca6_.log (deflated 91%) 2024-08-08T22:36:14.9839576Z adding: test/test-reports/inductor.test_cpu_cpp_wrapper_1.1_f23bf753e2964f7d_.log (stored 0%) 2024-08-08T22:36:14.9843607Z adding: test/test-reports/inductor.test_compiled_optimizers_4.4_a9993a0dd11ff1df_.log (deflated 92%) 2024-08-08T22:36:14.9845269Z adding: test/test-reports/inductor.test_cuda_cpp_wrapper_3.4_c69b2b2988173fe1_.log (deflated 85%) 2024-08-08T22:36:14.9846457Z adding: test/test-reports/cpp.test_jit_1.1_c290a18a0a023e8e_.log (deflated 48%) 2024-08-08T22:36:14.9864816Z adding: test/test-reports/cpp.test_jit_1.1_d1ef5241d4903eda_.log (deflated 93%) 2024-08-08T22:36:14.9865845Z adding: test/test-reports/cpp.test_mobile_nnc_1.1_fc078f2c7b1cc7f8_.log (deflated 48%) 2024-08-08T22:36:14.9866841Z adding: test/test-reports/cpp.test_mobile_nnc_1.1_cd31fac5c9fbda4a_.log (deflated 75%) 2024-08-08T22:36:14.9901010Z ##[group]Run # Remove any previous debugging artifacts if they exist 2024-08-08T22:36:14.9901722Z # Remove any previous debugging artifacts if they exist 2024-08-08T22:36:14.9902262Z rm -f debug-*.zip 2024-08-08T22:36:14.9902643Z if [ -d 'test/debug' ]; then 2024-08-08T22:36:14.9903135Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2024-08-08T22:36:14.9903599Z fi 2024-08-08T22:36:14.9913610Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:14.9914105Z env: 2024-08-08T22:36:14.9914386Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:14.9914835Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:14.9915855Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:14.9916725Z FILE_SUFFIX: test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350 2024-08-08T22:36:14.9917347Z ##[endgroup] 2024-08-08T22:36:15.0017533Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-08T22:36:15.0017989Z with: 2024-08-08T22:36:15.0018263Z s3-bucket: gha-artifacts 2024-08-08T22:36:15.0018697Z s3-prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:15.0019170Z retention-days: 14 2024-08-08T22:36:15.0019499Z if-no-files-found: warn 2024-08-08T22:36:15.0019858Z path: test-jsons-*.zip 2024-08-08T22:36:15.0020201Z name: artifact 2024-08-08T22:36:15.0020499Z region: us-east-1 2024-08-08T22:36:15.0020802Z env: 2024-08-08T22:36:15.0021079Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:15.0021523Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:15.0022251Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:15.0022883Z ##[endgroup] 2024-08-08T22:36:15.3579159Z NOTE: s3-prefix specified, ignoring name parameter 2024-08-08T22:36:15.3581486Z With the provided path, there will be 1 file uploaded 2024-08-08T22:36:15.3582101Z Uploading to s3 prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:15.3637687Z Starting upload of test-jsons-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350.zip 2024-08-08T22:36:15.4601539Z Finished upload of test-jsons-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350.zip 2024-08-08T22:36:15.5319831Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-08T22:36:15.5320267Z with: 2024-08-08T22:36:15.5320544Z s3-bucket: gha-artifacts 2024-08-08T22:36:15.5320966Z s3-prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:15.5321425Z retention-days: 14 2024-08-08T22:36:15.5321751Z if-no-files-found: error 2024-08-08T22:36:15.5322108Z path: test-reports-*.zip 2024-08-08T22:36:15.5322446Z name: artifact 2024-08-08T22:36:15.5322743Z region: us-east-1 2024-08-08T22:36:15.5323037Z env: 2024-08-08T22:36:15.5323300Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:15.5323748Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:15.5324478Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:15.5325095Z ##[endgroup] 2024-08-08T22:36:15.8892935Z NOTE: s3-prefix specified, ignoring name parameter 2024-08-08T22:36:15.8893532Z With the provided path, there will be 1 file uploaded 2024-08-08T22:36:15.8894142Z Uploading to s3 prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:15.8950863Z Starting upload of test-reports-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350.zip 2024-08-08T22:36:16.0771409Z Finished upload of test-reports-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350.zip 2024-08-08T22:36:16.1058638Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-08T22:36:16.1059069Z with: 2024-08-08T22:36:16.1059345Z s3-bucket: gha-artifacts 2024-08-08T22:36:16.1059774Z s3-prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:16.1060240Z retention-days: 14 2024-08-08T22:36:16.1060570Z if-no-files-found: ignore 2024-08-08T22:36:16.1060919Z path: logs-*.zip 2024-08-08T22:36:16.1061428Z name: artifact 2024-08-08T22:36:16.1061729Z region: us-east-1 2024-08-08T22:36:16.1062027Z env: 2024-08-08T22:36:16.1062287Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:16.1062732Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:16.1063451Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:16.1064067Z ##[endgroup] 2024-08-08T22:36:16.4698166Z NOTE: s3-prefix specified, ignoring name parameter 2024-08-08T22:36:16.4698901Z With the provided path, there will be 1 file uploaded 2024-08-08T22:36:16.4699668Z Uploading to s3 prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:16.4755630Z Starting upload of logs-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350.zip 2024-08-08T22:36:16.6312034Z Finished upload of logs-test-default-2-5-amz2023.linux.g5.4xlarge.nvidia.gpu_28538801350.zip 2024-08-08T22:36:16.6600668Z ##[group]Run seemethere/upload-artifact-s3@v5 2024-08-08T22:36:16.6601101Z with: 2024-08-08T22:36:16.6601399Z s3-bucket: gha-artifacts 2024-08-08T22:36:16.6601833Z s3-prefix: pytorch/pytorch/10308807531/1/artifact 2024-08-08T22:36:16.6602297Z retention-days: 14 2024-08-08T22:36:16.6602634Z if-no-files-found: ignore 2024-08-08T22:36:16.6602989Z path: debug-*.zip 2024-08-08T22:36:16.6603289Z name: artifact 2024-08-08T22:36:16.6603586Z region: us-east-1 2024-08-08T22:36:16.6603880Z env: 2024-08-08T22:36:16.6604145Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:16.6604590Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:16.6605317Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:16.6605937Z ##[endgroup] 2024-08-08T22:36:17.0126949Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2024-08-08T22:36:17.0415803Z ##[group]Run # shellcheck disable=SC2156 2024-08-08T22:36:17.0416409Z # shellcheck disable=SC2156 2024-08-08T22:36:17.0417188Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2024-08-08T22:36:17.0426506Z shell: /usr/bin/bash -e {0} 2024-08-08T22:36:17.0426858Z env: 2024-08-08T22:36:17.0427131Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:17.0427569Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:17.0428286Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:17.0428905Z ##[endgroup] 2024-08-08T22:36:17.3050774Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2024-08-08T22:36:17.3051314Z with: 2024-08-08T22:36:17.3051563Z env: 2024-08-08T22:36:17.3051832Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:17.3052269Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:17.3052983Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:17.3053609Z ##[endgroup] 2024-08-08T22:36:17.3072798Z ##[group]Run set -eou pipefail 2024-08-08T22:36:17.3073199Z set -eou pipefail 2024-08-08T22:36:17.3073538Z  2024-08-08T22:36:17.3074036Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2024-08-08T22:36:17.3074651Z for _ in $(seq 1440); do 2024-08-08T22:36:17.3075112Z  # Break if no ssh session exists anymore 2024-08-08T22:36:17.3075623Z  if [ "$(who)" = "" ]; then 2024-08-08T22:36:17.3076038Z  break 2024-08-08T22:36:17.3076483Z  fi 2024-08-08T22:36:17.3076769Z  echo "." 2024-08-08T22:36:17.3077087Z  sleep 5 2024-08-08T22:36:17.3077392Z done 2024-08-08T22:36:17.3086379Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:17.3086867Z env: 2024-08-08T22:36:17.3087146Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:17.3087587Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:17.3088302Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:17.3088922Z ##[endgroup] 2024-08-08T22:36:17.3122604Z Holding runner for 2 hours until all ssh sessions have logged out 2024-08-08T22:36:17.3178127Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2024-08-08T22:36:17.3178876Z # ignore expansion of "docker ps -q" since it could be empty 2024-08-08T22:36:17.3179457Z # shellcheck disable=SC2046 2024-08-08T22:36:17.3179913Z docker stop $(docker ps -q) || true 2024-08-08T22:36:17.3180381Z # Prune all of the docker images 2024-08-08T22:36:17.3180825Z docker system prune -af 2024-08-08T22:36:17.3189043Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:17.3189528Z env: 2024-08-08T22:36:17.3189803Z GIT_DEFAULT_BRANCH: main 2024-08-08T22:36:17.3190248Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2024-08-08T22:36:17.3190961Z DOCKER_CONTAINER_ID: 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:17.3191579Z ##[endgroup] 2024-08-08T22:36:18.1749014Z 44a07f2f048d 2024-08-08T22:36:23.4716086Z Deleted Containers: 2024-08-08T22:36:23.4716704Z 44a07f2f048d8a97b0be815c8ab5214b9e114fb02f813c68076187e82bd675c4 2024-08-08T22:36:23.4717162Z 2024-08-08T22:36:32.1301543Z Deleted Images: 2024-08-08T22:36:32.1303365Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9:02ec4fbd5adcb3fb91cf5ce431dec18b633de7d9 2024-08-08T22:36:32.1305918Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9@sha256:a898a1b237f3fdd8f48bb276749a44016150d2c25ae891805a3c665b34fc8003 2024-08-08T22:36:32.1307409Z deleted: sha256:cd235e6b1be7a0d3099ad6f7c05391c21669cd859ddf645fc846603ac8b56d9d 2024-08-08T22:36:32.1308241Z deleted: sha256:45e0976f575503941a10b3cba4a3ff590bfc203b4802c6de8c59b369d656724a 2024-08-08T22:36:32.1309569Z deleted: sha256:c61acbbfdbad22e2c88d71ca0d29b93d1fcce66c2ae345c9bb96a7dec302c41c 2024-08-08T22:36:32.1310664Z deleted: sha256:ff8a69cb9279e8a03a7762132f9250853a69284719cb945f8a35243187f47279 2024-08-08T22:36:32.1311490Z deleted: sha256:bfa68c5e2a3a3ede4c060e1f6d0d81ab96d1b00799acfae1f56f70b0568711c1 2024-08-08T22:36:32.1312409Z deleted: sha256:34d749c950e5c984c12e879e402a48e25d5b9679a2f76e39d1918cb44ae6a71a 2024-08-08T22:36:32.1313215Z deleted: sha256:d0e25776a85880ce3302ae6db212459666aab48f1334835d87f21de012907a02 2024-08-08T22:36:32.1314267Z deleted: sha256:56f4028ec8948e5bbe1fc78bb1d9be7b0d23ca11d3104bde7f0282fc20e1c127 2024-08-08T22:36:32.1315260Z deleted: sha256:0c794ed2194f5103e859213866ed9909f09810fbf10ce2dbc14c002c01e69f93 2024-08-08T22:36:32.1316086Z deleted: sha256:8402d1b515f17b0dce83caf95ffcb31956a9072bbfb00fb391c2fc2e898ba254 2024-08-08T22:36:32.1316909Z deleted: sha256:c5d33c900427aa286d93c976cf12d3cebdb2768378fff3885020764d25f3bc88 2024-08-08T22:36:32.1317725Z deleted: sha256:604d9b0bd0bbdd7a26f27307b4d8992f6d3b3adebc46b238527cf5f0bedef4c4 2024-08-08T22:36:32.1318548Z deleted: sha256:0cd1ea2753107d557974d29a4bb4730b677ca5573deaf385aec686604c46401f 2024-08-08T22:36:32.1319363Z deleted: sha256:0bac64609c89c877da39f47fd39acf4fa56f1f2a27a08bb46139e425d304ab60 2024-08-08T22:36:32.1320171Z deleted: sha256:4986487e5ed4b36a77cf0eeefc92c8905a6bace35b800b029313648631b6d03d 2024-08-08T22:36:32.1321044Z deleted: sha256:cc720cee203d1b11d91181db2cfb5107dd19bce5e6861f0ef6960c35a25be595 2024-08-08T22:36:32.1321878Z deleted: sha256:7f43efadde3b97ee2c002bbb560a974ba5f46dcccbba24e64163b8616ec796dd 2024-08-08T22:36:32.1322801Z deleted: sha256:ab135f617807e146f11acea7262e2b7608cde27b65c17e3cdde67ccea6b14992 2024-08-08T22:36:32.1323598Z deleted: sha256:5a989dc79511366cc92333e53994e4436738aa240bc3c47e3fbd56730675a2d8 2024-08-08T22:36:32.1324393Z deleted: sha256:a1df2af593e5055f5e2b08c143428588b2185c33be2e8fd9ffbf6550ffb87444 2024-08-08T22:36:32.1325228Z deleted: sha256:ca27bfb2f8cc8d54ae99babc01c9ed877b0f4f6e584a4f02e8b9e5d1d9c87180 2024-08-08T22:36:32.1326033Z deleted: sha256:228989acb158238200f18b4215b8b197b898b7a096119505fc0d3c102f50deee 2024-08-08T22:36:32.1326824Z deleted: sha256:1a9ef9dd08df2580c9727a19d02612712f4423d670cf3ef1987174a1b68614d5 2024-08-08T22:36:32.1327629Z deleted: sha256:20b511fb91716943dad5eeeae90ece21a8df93934737eb36de2c835a9d737ba2 2024-08-08T22:36:32.1328420Z deleted: sha256:8b7868b74d782bb44566be626004b12d6b7512a67161e2ef871a10110522ebb6 2024-08-08T22:36:32.1329212Z deleted: sha256:3a3302a77096f6f56e6f0fc8c8204f83c77da624a0a796a3535ee2e90027f96c 2024-08-08T22:36:32.1330031Z deleted: sha256:6d4f4af85672bbb276ce5e0907b4cfda1bfa97dfa165024fc074a62e4cc71879 2024-08-08T22:36:32.1330845Z deleted: sha256:37e2bcfdc59452203f4a07dac6a0698704c04bbfccbec59e231636aa47af55a3 2024-08-08T22:36:32.1331655Z deleted: sha256:dd486adc1f62b525884ef12618694faf9882f1886bf74146a1ad1e6e22807ef2 2024-08-08T22:36:32.1332451Z deleted: sha256:b270237b781e3f5f003e295c5a93cf7c773a2243a3fb7eb12269d33347136d3a 2024-08-08T22:36:32.1333255Z deleted: sha256:804b5c4dd7aad2c2fca3065276469b495240ab6b7e945e764fe8fdd14115dc86 2024-08-08T22:36:32.1334062Z deleted: sha256:904f81e9115e75fe6eebcbde4cc9ad7d882fc474e265c50e003a3d292007ff0e 2024-08-08T22:36:32.1334870Z deleted: sha256:29a7b758345afa399031b70583f145466e34ab5bfceb8f4e99d39db53ca3507c 2024-08-08T22:36:32.1335669Z deleted: sha256:f142ce97b4eb60c3413b3a29ec3a3320037802e7debb9a90b6b157dee153abdc 2024-08-08T22:36:32.1336457Z deleted: sha256:4b54555217f7f889b4f5c43f115720f45963b8be4183e4c87c3b9e813cbf3929 2024-08-08T22:36:32.1337267Z deleted: sha256:e60bedd55ab6a7963172852bc199c80c39f3a3ea23354c41d8fd8bc6b141154a 2024-08-08T22:36:32.1338070Z deleted: sha256:cf97738287b3a0c55b67b024ad13970ee424e0b62e946249ad6180b95fcd8bc6 2024-08-08T22:36:32.1338878Z deleted: sha256:9e982b48aec3a7dbde4854e72ea1c3b3051b7b72deb76ed8872702ee27e0edbc 2024-08-08T22:36:32.1339690Z deleted: sha256:0ba0b263f6f6d1d9d189c7290599cba72cba9b3cda829505862aae8cb769c84a 2024-08-08T22:36:32.1340580Z deleted: sha256:66e9dd839f589fbb532d8ec0a630969713f0cade1718b1180f7fe26a538f3be0 2024-08-08T22:36:32.1341391Z deleted: sha256:c65a350f82fd0839046ab6b440122fb4d6f37fd9eed4122e1af8b1753e14fac5 2024-08-08T22:36:32.1342216Z deleted: sha256:d69dc134f0f0e1cee4240d762c4d5691b8094ba6f5e59eca754334dece3e4915 2024-08-08T22:36:32.1343033Z deleted: sha256:ae76f208c51d40eb54554ea1364e9af019accdef034fde105faaf175c4bdbd71 2024-08-08T22:36:32.1343854Z deleted: sha256:858f0f9520afafd43dfeb4c993a8d9a616dc8f02d22747a1d3b5d7197f652245 2024-08-08T22:36:32.1344814Z deleted: sha256:861076dfb9595f35cc788c7994492af10057765f7e1144a58677a8bea20c0090 2024-08-08T22:36:32.1345609Z deleted: sha256:2d71d4b9defc3248e0657b644d8fa8746329ee53252128df6260bdac07a1e601 2024-08-08T22:36:32.1346658Z deleted: sha256:29004ba29a25af35c37b4fc76b65cdd1a6e10e3dbd5fd02ed2d24fb47f45b7d6 2024-08-08T22:36:32.1347651Z deleted: sha256:b1411e2373c105437b4d99814f459a19ee299ef18c3a57e6a86e24206d1f90b9 2024-08-08T22:36:32.1348451Z deleted: sha256:32cc529ff369f7121a4e6bc51cdb7090d9c47d98233037c279730828347723ac 2024-08-08T22:36:32.1349250Z deleted: sha256:7235b067158b7aca2be16b9d2dbae06b610267aaeb41be289c77f2b801fd5e8c 2024-08-08T22:36:32.1350059Z deleted: sha256:f000b1b348e11f95e2e91541f128ca3e6695a00742fe6e16215bc321f213362c 2024-08-08T22:36:32.1350857Z deleted: sha256:88e9dd01ce433a853ac999c98231d6d47b4dd3a03391dfd56d4ea80528388089 2024-08-08T22:36:32.1351678Z deleted: sha256:cbdf41f558fd321dd6bdae94b52dcd65e7d35d6d3d7d822bcc3c58a1bb22872b 2024-08-08T22:36:32.1352610Z deleted: sha256:ac87fc04ff8d229dcfcb5d951627b7914cbc33c740161fc658608a75b41fd40a 2024-08-08T22:36:32.1353494Z deleted: sha256:9374806889af634777725ffcac8da3cec3fbada86d187ab4a81c59bad7e5b606 2024-08-08T22:36:32.1354316Z deleted: sha256:8512ea010bd0fbdceb7b671c3f6b2cced0fda4187080e2f1ff1c6e6037979dbb 2024-08-08T22:36:32.1355131Z deleted: sha256:14cbe137d139085b0f8df186eed7a8bbd9081b7c3b803e7b64365b0c3a2aee52 2024-08-08T22:36:32.1355947Z deleted: sha256:88ced829326eed4b4c235f0f806b6c686c1b2ef209dc319b7f03731c04bcf0d5 2024-08-08T22:36:32.1356772Z deleted: sha256:fa2f10a66c16fac3a2d5642d21f2cd69b0148111dd1c930dab83cb97ce6682cf 2024-08-08T22:36:32.1357578Z deleted: sha256:7f83eb696a42997925b1e29da8e78d8e2df08443db59a8585cdc72cd94c1c782 2024-08-08T22:36:32.1358372Z deleted: sha256:4dd8500484390881a49bd9ed71859da08379e801e124b7b57847742858857c68 2024-08-08T22:36:32.1359177Z deleted: sha256:f69f31dbdcdab61878bc9e8f2333afff7e73e3d07408f329c583f6d790c5ad07 2024-08-08T22:36:32.1359989Z deleted: sha256:9eb132174f81d00607875e3d95ac8be9c9fda5ed780de6df7efdacf14c33a568 2024-08-08T22:36:32.1360801Z deleted: sha256:24e7cb4074c44e2087d68312d0efafb74732b8999579a1c74bd9363e6495efb0 2024-08-08T22:36:32.1361624Z deleted: sha256:c6d54db7315b535608dbab74fbbf6679f7c0c228a45dac092f7ba7a3addc774f 2024-08-08T22:36:32.1362453Z deleted: sha256:feee1b527542c64f2f2ddfb22b32aea7bb209d132d82d50ea65b46829c26b31e 2024-08-08T22:36:32.1363280Z deleted: sha256:92d2aed80e97a958da3271d1da4cb80fef8b9bd257e5939d6b7ffea0f99cbcc8 2024-08-08T22:36:32.1364103Z deleted: sha256:983e83d0d3bf5b974d81434cd3a3deb5975c077e72dc7134f9c70a7f601255cd 2024-08-08T22:36:32.1364909Z deleted: sha256:433544e4353c9e0269dc71fa16c00ddcecdcd5676ed0df618c7b51ac78e23053 2024-08-08T22:36:32.1365726Z deleted: sha256:fc24af9fd5121adfbc7b75df0abb1ef8c2d3868f33b7b6b1e8bf4e214fac4bd6 2024-08-08T22:36:32.1366544Z deleted: sha256:ce1492ad94b71204f8b3fb8903d0ad48a96652635e7fc9347e430639ca15a737 2024-08-08T22:36:32.1367335Z deleted: sha256:96f2ae9435e6e12f68a88dc404c8079b2796639589970c34894a86ccff3f3732 2024-08-08T22:36:32.1368138Z deleted: sha256:5310d2bc0952e7c6d8cbade7ecafd88ce3a04d9da303fa6419583e13a02b58a9 2024-08-08T22:36:32.1368945Z deleted: sha256:099076d608a55a027d60f5b572a3991a8d690a01500acf2c696933a7e6d38650 2024-08-08T22:36:32.1369737Z deleted: sha256:862b05d0f8f7e3dff2d04bd49784a9901f4e22c5dd6544ae72eb3be05e23fffe 2024-08-08T22:36:32.1370550Z deleted: sha256:f7a888e5f7904827f6d71cebcba5169dbc7a78b20f228b9b9b625f38e0c52f24 2024-08-08T22:36:32.1371416Z deleted: sha256:ab9f75a4a7a636aae480f630eaddc05133dda64f1c42445b617c97d3d209991c 2024-08-08T22:36:32.1372243Z deleted: sha256:56dda0c9cee64db5c10d5dd5f08c6f6707263f5ee98e8a075d635e5be996e855 2024-08-08T22:36:32.1373068Z deleted: sha256:5faf9c0a9efe4675ecd21a4ec417d51077d5e75da9e673161a94e7d6cd43f92c 2024-08-08T22:36:32.1373557Z 2024-08-08T22:36:32.1373698Z Total reclaimed space: 25.77GB 2024-08-08T22:36:32.1439415Z Post job cleanup. 2024-08-08T22:36:32.1489385Z Post job cleanup. 2024-08-08T22:36:32.2390795Z [command]/usr/bin/git version 2024-08-08T22:36:32.2442951Z git version 2.40.1 2024-08-08T22:36:32.2480995Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/6edc7cf4-8199-4f56-b443-d5a19eeb434c' before making global git config changes 2024-08-08T22:36:32.2482227Z Adding repository directory to the temporary git global config as a safe directory 2024-08-08T22:36:32.2486346Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2024-08-08T22:36:32.2536192Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2024-08-08T22:36:32.2583895Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2024-08-08T22:36:32.2940716Z Entering 'android/libs/fbjni' 2024-08-08T22:36:32.3003846Z Entering 'third_party/FP16' 2024-08-08T22:36:32.3070942Z Entering 'third_party/FXdiv' 2024-08-08T22:36:32.3144006Z Entering 'third_party/NNPACK' 2024-08-08T22:36:32.3213959Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T22:36:32.3283584Z Entering 'third_party/XNNPACK' 2024-08-08T22:36:32.3372150Z Entering 'third_party/benchmark' 2024-08-08T22:36:32.3442037Z Entering 'third_party/cpp-httplib' 2024-08-08T22:36:32.3509628Z Entering 'third_party/cpuinfo' 2024-08-08T22:36:32.3579638Z Entering 'third_party/cudnn_frontend' 2024-08-08T22:36:32.3649176Z Entering 'third_party/cutlass' 2024-08-08T22:36:32.3725846Z Entering 'third_party/eigen' 2024-08-08T22:36:32.3797156Z Entering 'third_party/fbgemm' 2024-08-08T22:36:32.3867772Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T22:36:32.3933979Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T22:36:32.3999402Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T22:36:32.4073016Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T22:36:32.4141903Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T22:36:32.4210444Z Entering 'third_party/flatbuffers' 2024-08-08T22:36:32.4283044Z Entering 'third_party/fmt' 2024-08-08T22:36:32.4351255Z Entering 'third_party/foxi' 2024-08-08T22:36:32.4419386Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T22:36:32.4487784Z Entering 'third_party/gloo' 2024-08-08T22:36:32.4563286Z Entering 'third_party/googletest' 2024-08-08T22:36:32.4631059Z Entering 'third_party/ideep' 2024-08-08T22:36:32.4696323Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T22:36:32.4771525Z Entering 'third_party/ittapi' 2024-08-08T22:36:32.4841304Z Entering 'third_party/kineto' 2024-08-08T22:36:32.4909485Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T22:36:32.4976214Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T22:36:32.5045233Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T22:36:32.5112565Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T22:36:32.5180420Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T22:36:32.5247259Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T22:36:32.5320137Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T22:36:32.5389140Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T22:36:32.5457632Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T22:36:32.5527719Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T22:36:32.5599468Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T22:36:32.5668050Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T22:36:32.5737045Z Entering 'third_party/mimalloc' 2024-08-08T22:36:32.5806131Z Entering 'third_party/nccl/nccl' 2024-08-08T22:36:32.5876447Z Entering 'third_party/nlohmann' 2024-08-08T22:36:32.5946997Z Entering 'third_party/onnx' 2024-08-08T22:36:32.6029271Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T22:36:32.6096827Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T22:36:32.6169520Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T22:36:32.6240404Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T22:36:32.6305739Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T22:36:32.6371700Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T22:36:32.6437345Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T22:36:32.6504520Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T22:36:32.6570001Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T22:36:32.6636998Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T22:36:32.6699884Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T22:36:32.6769701Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T22:36:32.6839638Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T22:36:32.6928015Z Entering 'third_party/pocketfft' 2024-08-08T22:36:32.6997259Z Entering 'third_party/protobuf' 2024-08-08T22:36:32.7068965Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T22:36:32.7136433Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T22:36:32.7206559Z Entering 'third_party/psimd' 2024-08-08T22:36:32.7277856Z Entering 'third_party/pthreadpool' 2024-08-08T22:36:32.7349106Z Entering 'third_party/pybind11' 2024-08-08T22:36:32.7422245Z Entering 'third_party/python-peachpy' 2024-08-08T22:36:32.7489226Z Entering 'third_party/sleef' 2024-08-08T22:36:32.7556750Z Entering 'third_party/tensorpipe' 2024-08-08T22:36:32.7624666Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T22:36:32.7690542Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T22:36:32.7757441Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T22:36:32.7825662Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T22:36:32.7890627Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T22:36:32.7989936Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2024-08-08T22:36:32.8026825Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8037002Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2024-08-08T22:36:32.8084495Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2024-08-08T22:36:32.8452992Z Entering 'android/libs/fbjni' 2024-08-08T22:36:32.8499477Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8544026Z Entering 'third_party/FP16' 2024-08-08T22:36:32.8587802Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8631516Z Entering 'third_party/FXdiv' 2024-08-08T22:36:32.8674613Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8717981Z Entering 'third_party/NNPACK' 2024-08-08T22:36:32.8761172Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8804179Z Entering 'third_party/VulkanMemoryAllocator' 2024-08-08T22:36:32.8847519Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8890579Z Entering 'third_party/XNNPACK' 2024-08-08T22:36:32.8935866Z http.https://github.com/.extraheader 2024-08-08T22:36:32.8994832Z Entering 'third_party/benchmark' 2024-08-08T22:36:32.9039529Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9083572Z Entering 'third_party/cpp-httplib' 2024-08-08T22:36:32.9127346Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9172378Z Entering 'third_party/cpuinfo' 2024-08-08T22:36:32.9214581Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9259591Z Entering 'third_party/cudnn_frontend' 2024-08-08T22:36:32.9302407Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9345690Z Entering 'third_party/cutlass' 2024-08-08T22:36:32.9388546Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9439453Z Entering 'third_party/eigen' 2024-08-08T22:36:32.9481831Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9528155Z Entering 'third_party/fbgemm' 2024-08-08T22:36:32.9570873Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9613174Z Entering 'third_party/fbgemm/third_party/asmjit' 2024-08-08T22:36:32.9656077Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9698763Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2024-08-08T22:36:32.9746600Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9789629Z Entering 'third_party/fbgemm/third_party/cutlass' 2024-08-08T22:36:32.9833017Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9883101Z Entering 'third_party/fbgemm/third_party/googletest' 2024-08-08T22:36:32.9925680Z http.https://github.com/.extraheader 2024-08-08T22:36:32.9968123Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2024-08-08T22:36:33.0011077Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0057714Z Entering 'third_party/flatbuffers' 2024-08-08T22:36:33.0100399Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0145877Z Entering 'third_party/fmt' 2024-08-08T22:36:33.0188401Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0231012Z Entering 'third_party/foxi' 2024-08-08T22:36:33.0273645Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0316411Z Entering 'third_party/gemmlowp/gemmlowp' 2024-08-08T22:36:33.0361416Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0405787Z Entering 'third_party/gloo' 2024-08-08T22:36:33.0448933Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0492658Z Entering 'third_party/googletest' 2024-08-08T22:36:33.0535044Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0577628Z Entering 'third_party/ideep' 2024-08-08T22:36:33.0622667Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0664472Z Entering 'third_party/ideep/mkl-dnn' 2024-08-08T22:36:33.0706859Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0758372Z Entering 'third_party/ittapi' 2024-08-08T22:36:33.0802339Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0846069Z Entering 'third_party/kineto' 2024-08-08T22:36:33.0889425Z http.https://github.com/.extraheader 2024-08-08T22:36:33.0932122Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2024-08-08T22:36:33.0975198Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1019033Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2024-08-08T22:36:33.1063061Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1109191Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2024-08-08T22:36:33.1154648Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1199737Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2024-08-08T22:36:33.1243052Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1287742Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2024-08-08T22:36:33.1331796Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1375356Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2024-08-08T22:36:33.1419139Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1469039Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2024-08-08T22:36:33.1512823Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1557500Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2024-08-08T22:36:33.1599935Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1645422Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2024-08-08T22:36:33.1688417Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1735867Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2024-08-08T22:36:33.1780382Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1829096Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2024-08-08T22:36:33.1872296Z http.https://github.com/.extraheader 2024-08-08T22:36:33.1915348Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2024-08-08T22:36:33.1958614Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2004751Z Entering 'third_party/mimalloc' 2024-08-08T22:36:33.2051333Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2095091Z Entering 'third_party/nccl/nccl' 2024-08-08T22:36:33.2139472Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2183776Z Entering 'third_party/nlohmann' 2024-08-08T22:36:33.2227937Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2273106Z Entering 'third_party/onnx' 2024-08-08T22:36:33.2318256Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2375979Z Entering 'third_party/onnx/third_party/benchmark' 2024-08-08T22:36:33.2420514Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2464662Z Entering 'third_party/onnx/third_party/pybind11' 2024-08-08T22:36:33.2507702Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2556920Z Entering 'third_party/opentelemetry-cpp' 2024-08-08T22:36:33.2600372Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2645149Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2024-08-08T22:36:33.2687335Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2732548Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2024-08-08T22:36:33.2774910Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2819560Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2024-08-08T22:36:33.2861603Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2904555Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2024-08-08T22:36:33.2946920Z http.https://github.com/.extraheader 2024-08-08T22:36:33.2990467Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2024-08-08T22:36:33.3033320Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3075283Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2024-08-08T22:36:33.3122458Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3165650Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2024-08-08T22:36:33.3208494Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3250056Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2024-08-08T22:36:33.3292757Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3342403Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2024-08-08T22:36:33.3384119Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3431079Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2024-08-08T22:36:33.3478330Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3542730Z Entering 'third_party/pocketfft' 2024-08-08T22:36:33.3587000Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3630739Z Entering 'third_party/protobuf' 2024-08-08T22:36:33.3673636Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3719232Z Entering 'third_party/protobuf/third_party/benchmark' 2024-08-08T22:36:33.3762229Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3805755Z Entering 'third_party/protobuf/third_party/googletest' 2024-08-08T22:36:33.3849603Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3895545Z Entering 'third_party/psimd' 2024-08-08T22:36:33.3938877Z http.https://github.com/.extraheader 2024-08-08T22:36:33.3982313Z Entering 'third_party/pthreadpool' 2024-08-08T22:36:33.4025983Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4068488Z Entering 'third_party/pybind11' 2024-08-08T22:36:33.4110888Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4157939Z Entering 'third_party/python-peachpy' 2024-08-08T22:36:33.4201236Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4244721Z Entering 'third_party/sleef' 2024-08-08T22:36:33.4287681Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4330733Z Entering 'third_party/tensorpipe' 2024-08-08T22:36:33.4374556Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4416831Z Entering 'third_party/tensorpipe/third_party/googletest' 2024-08-08T22:36:33.4459685Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4502420Z Entering 'third_party/tensorpipe/third_party/libnop' 2024-08-08T22:36:33.4544915Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4586974Z Entering 'third_party/tensorpipe/third_party/libuv' 2024-08-08T22:36:33.4629856Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4673083Z Entering 'third_party/tensorpipe/third_party/pybind11' 2024-08-08T22:36:33.4714790Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4756749Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2024-08-08T22:36:33.4799194Z http.https://github.com/.extraheader 2024-08-08T22:36:33.4956125Z A job completed hook has been configured by the self-hosted runner administrator 2024-08-08T22:36:33.4976565Z ##[group]Run '/home/ec2-user/runner-scripts/after_job.sh' 2024-08-08T22:36:33.4984723Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2024-08-08T22:36:33.4985230Z ##[endgroup] 2024-08-08T22:36:40.3951038Z Cleaning up orphan processes